Skip to content
adatum
  •  Learn Azure Bicep
  •  SCOM Web API
  •  About adatum
Azure

How to move Azure blobs up the path

  • 25/01/202225/01/2022
  • by Martin Ehrnst

This is the short, but for me, pretty intense story from when I uploaded 900 blobs of one Gb each to the wrong path in a storage container. Eventually I was able to move these files using azcopy and PowerShell

Thanos persistent Prometheus metrics

In our Azure Kubernetes environment (AKS) we use Prometheus and Thanos for application metrics. Thanos allow us to use Azure Storage for long term retention and high availability Prometheus setup. The other day I was challenged with deleting a series of metrics causing high cardinality. Meaning that a lot of new series of data was written due to a parameter being inserted during scraping.

The way Thanos works is that it takes raw prometheus data, downsamples it and upload it to Azure Storage for long term retention. Each time this process runs, it will create a new blob. In our production environment we had around 900 blobs and 900gb of data.

Thanos has a built in tool to rewrite these blobs and remove the metric we wanted, which seemed easy enough to do, but we had no idea when the problem first started, so I had to analyze, rewrite and upload all the data. It all seemed to work fine, util I discovered no metrics where available. It turned out that the tool I used inherited my local path and uploaded all the modified data to <guid>/chunks/c:/users/appdata/local/[...]/00001.blob

So no matter how satisfied I was, all the data was useless as thanos expected the files to be under <guid>/chunks/00001. On the bright side, all data was there, so the challenge was to move the files from <guid>/chunks/c:/users/appdata/local/[...] to <guid>/chunks/. From the two pictures below you can see the folder structure. Going trough a download and upload approach was the last thing I wanted to do.

Azure storage explorer

AzCopy and PowerShell to the rescue

I already knew my way around azcopy. But I did not know the process actually run on the Azure backbone if you copy within or between storage accounts. Luckily my dear Twitter friends was there to help where I failed to read the documentation.

To perform the copy operation I used a combination of Azure Powershell and AzCopy.

  • Connect
  • Get all current blobs
  • Filter them
  • Actually copy
  • Second loop to delete

Below is my complete script. This could be way smarter but I quickly put it together to get the job done.

## connect to storage using SAS
$storageName = ""
$sasToken = ""
$container = ""
$ctx = New-AzureStorageContext –StorageAccountName $storageName –SasToken $sasToken
# get all the blobs
$blobs = Get-AzureStorageBlob –Container $container –Context $ctx
# a date to filter on
$date = (get-date –Date 20.01.2022 –AsUTC)
# filter the blobs for date and where name has /c:/..
$blobsToModify = $blobs | where { ($_.LastModified.DateTime -ge $date) -and ($_.LastModified.DateTime -le $date.AddHours(24)) -and ($_.Name -like "*/chunks/C:/Users/*") }
# loop through the blobs
# get the original folder name and the blob name with some splitting
foreach ($blob in $blobsToModify) {
$blobtoMove = $blob.name
$original = $blob.Name.split("/",2)[0] # trim to original name
$newBlob = $blob.Name.split("/")[-1] # trim to original chunk name
# actually copy
./azcopy.exe copy "https://$storageName.blob.core.windows.net/thanos/$blobToMove$sasToken" "https://$storageName.blob.core.windows.net/thanos/$original/chunks/$newBlob$sasToken" —overwrite=prompt —s2s–preserve–access–tier=false —include–directory–stub=false —recursive —log–level=INFO;
}
# antother loop to delete the whole c:/ folder after the chunks of data is moved
# i have a separate loop as there might be multiple chunks in the folder.
foreach ($blob in $blobsToModify) {
$original = $blob.Name.split("/",2)[0] # trim to original name
./azcopy.exe remove "https://$storageName.blob.core.windows.net/thanos/$original/chunks/C%3A/$sasToken" —from–to=BlobTrash —recursive —log–level=INFO;
}
view raw azcopy-move.ps1 hosted with ❤ by GitHub

Summary

I hope this helps someone else who accidentaly upload a lot of data to the wrong place. If you by any chance are using Thanos. I filed this as a bug.

Share this:

  • LinkedIn
  • Twitter
  • Reddit

Top Posts & Pages

  • Azure Application registrations, Enterprise Apps, and managed identities
  • Automate Azure DevOps like a boss
  • Multi subscription deployment with DevOps and Azure Lighthouse
  • Creating Azure AD Application using Powershell
  • Azure token from a custom app registration
  • Script to add SCOM agent management group
  • Azure AD authentication in Azure Functions
  • How to move Azure blobs up the path
  • Track changes to Azure resources
  • Azure Bicep modules, variables, and T-shirt sizing

Tags

agent announcements api ARM authoring Automation azcopy Azure AzureAD Azure Bicep AzureDevOps AzureFunctions AzureLighthouse AzureManagement AzureMonitor AzureSpringClean Bicep Community CSP database EventGrid healthservicestore IaC Infrastructure as code Integrations logs management pack Microsoft Build Microsoft Partner monitoring MSIgnite MSOMS MSP nicconf Nordic Virtual Summit OperationsManager OpsMgr Powershell QUickPublish rest SCOM Serverless SquaredUP SysCtr system center

Follow Martin Ehrnst

  • Twitter
  • LinkedIn

RSS feed RSS - Posts

RSS feed RSS - Comments

Microsoft Azure MVP

Martin Ehrnst Microsoft Azure MVP
Adatum.no use cookies to ensure that we give you the best experience on our website. If you continue to use this site we will assume that you are happy with it. Cookie Policy
Theme by Colorlib Powered by WordPress