Azure

How to move Azure blobs up the path

This is the short, but for me, pretty intense story from when I uploaded 900 blobs of one Gb each to the wrong path in a storage container. Eventually I was able to move these files using azcopy and PowerShell

Thanos persistent Prometheus metrics

In our Azure Kubernetes environment (AKS) we use Prometheus and Thanos for application metrics. Thanos allow us to use Azure Storage for long term retention and high availability Prometheus setup. The other day I was challenged with deleting a series of metrics causing high cardinality. Meaning that a lot of new series of data was written due to a parameter being inserted during scraping.

The way Thanos works is that it takes raw prometheus data, downsamples it and upload it to Azure Storage for long term retention. Each time this process runs, it will create a new blob. In our production environment we had around 900 blobs and 900gb of data.

Thanos has a built in tool to rewrite these blobs and remove the metric we wanted, which seemed easy enough to do, but we had no idea when the problem first started, so I had to analyze, rewrite and upload all the data. It all seemed to work fine, util I discovered no metrics where available. It turned out that the tool I used inherited my local path and uploaded all the modified data to <guid>/chunks/c:/users/appdata/local/[...]/00001.blob

So no matter how satisfied I was, all the data was useless as thanos expected the files to be under <guid>/chunks/00001. On the bright side, all data was there, so the challenge was to move the files from <guid>/chunks/c:/users/appdata/local/[...] to <guid>/chunks/. From the two pictures below you can see the folder structure. Going trough a download and upload approach was the last thing I wanted to do.

Azure storage explorer

AzCopy and PowerShell to the rescue

I already knew my way around azcopy. But I did not know the process actually run on the Azure backbone if you copy within or between storage accounts. Luckily my dear Twitter friends was there to help where I failed to read the documentation.

To perform the copy operation I used a combination of Azure Powershell and AzCopy.

  • Connect
  • Get all current blobs
  • Filter them
  • Actually copy
  • Second loop to delete

Below is my complete script. This could be way smarter but I quickly put it together to get the job done.

## connect to storage using SAS
$storageName = ""
$sasToken = ""
$container = ""
$ctx = New-AzureStorageContext StorageAccountName $storageName SasToken $sasToken
# get all the blobs
$blobs = Get-AzureStorageBlob Container $container Context $ctx
# a date to filter on
$date = (get-date Date 20.01.2022 AsUTC)
# filter the blobs for date and where name has /c:/..
$blobsToModify = $blobs | where { ($_.LastModified.DateTime -ge $date) -and ($_.LastModified.DateTime -le $date.AddHours(24)) -and ($_.Name -like "*/chunks/C:/Users/*") }
# loop through the blobs
# get the original folder name and the blob name with some splitting
foreach ($blob in $blobsToModify) {
$blobtoMove = $blob.name
$original = $blob.Name.split("/",2)[0] # trim to original name
$newBlob = $blob.Name.split("/")[-1] # trim to original chunk name
# actually copy
./azcopy.exe copy "https://$storageName.blob.core.windows.net/thanos/$blobToMove$sasToken" "https://$storageName.blob.core.windows.net/thanos/$original/chunks/$newBlob$sasToken" overwrite=prompt s2spreserveaccesstier=false includedirectorystub=false recursive loglevel=INFO;
}
# antother loop to delete the whole c:/ folder after the chunks of data is moved
# i have a separate loop as there might be multiple chunks in the folder.
foreach ($blob in $blobsToModify) {
$original = $blob.Name.split("/",2)[0] # trim to original name
./azcopy.exe remove "https://$storageName.blob.core.windows.net/thanos/$original/chunks/C%3A/$sasToken" fromto=BlobTrash recursive loglevel=INFO;
}
view raw azcopy-move.ps1 hosted with ❤ by GitHub

Summary

I hope this helps someone else who accidentaly upload a lot of data to the wrong place. If you by any chance are using Thanos. I filed this as a bug.

Engage by commenting