Efficient File Management on Azure Blob Storage: CRUD Operations and Upload Strategies

Introduction

Azure Blob Storage is a scalable, secure, and cost-effective solution for storing unstructured data such as images, videos, and documents. Efficiently managing files—including creating, reading, updating, and deleting (CRUD) operations—is crucial for building robust cloud-based applications.

In this article, we’ll dive into upload strategies such as IFormFile, chunked uploads, and streams. By the end, you’ll have the tools to optimize performance, minimize memory usage, and handle file management with confidence.


1. Understanding the Concept

What It Is

Azure Blob Storage is a Microsoft-managed service designed to store large amounts of unstructured data in the cloud. It offers three types of blobs:

  1. Block blobs: For storing text and binary data.
  2. Append blobs: Optimized for append operations, such as logging.
  3. Page blobs: For random read/write operations, often used in virtual disks.

Why It Matters

Efficient file management on Azure Blob Storage is essential for applications requiring high availability, durability, and scalability. This includes scenarios like:

  • Serving media files in a content delivery network (CDN).
  • Backing up and archiving critical data.
  • Storing and processing large datasets for analytics.

Real-World Use Cases

Azure Blob Storage's versatility spans industries. Common use cases include:

  • Hosting static website assets and streaming media.
  • Archiving IoT data, regulatory records, and enterprise documents.
  • Backups, disaster recovery, and big data workflows such as analytics and machine learning.
  • Domain-specific storage: healthcare imaging, geospatial data, gaming assets, and e-commerce content.
  • Supporting scientific research, education, and software development.

With features like global redundancy and robust security, it serves industries such as retail, logistics, and AI effectively.


2. Step-by-Step Implementation

2.1. Prerequisites

Tools and Frameworks

  • Azure account with a Blob Storage resource.
  • Azure Storage SDK for your programming language (e.g., .NET, Python).
  • Development environment (e.g., Visual Studio, VS Code).
  • Basic knowledge of REST APIs and cloud storage.

Setup Instructions

  1. Create an Azure Blob Storage account via the Azure Portal.
  2. Install the Azure Storage SDK for your programming language:

     dotnet add package Azure.Storage.Blobs
  3. Retrieve your storage account connection string from the Azure Portal.

2.2. Code Walkthrough

Establishing a Connection

    using Azure.Storage.Blobs;

    public class BlobStorageService
    {
        private readonly BlobServiceClient _blobServiceClient;

        public BlobStorageService(IConfiguration configuration)
        {
            // Reads the "AzureStorage" connection string retrieved in step 3 above
            var connectionString = configuration.GetConnectionString("AzureStorage");
            _blobServiceClient = new BlobServiceClient(connectionString);
        }

        // The upload, list, and download methods shown below belong to this class
    }
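
For the constructor injection above to work, the service has to be registered with the dependency injection container. A minimal sketch, assuming a typical minimal-hosting Program.cs:

    // Program.cs (sketch): register the service so controllers can inject it
    var builder = WebApplication.CreateBuilder(args);
    builder.Services.AddControllers();
    builder.Services.AddSingleton<BlobStorageService>();

    var app = builder.Build();
    app.MapControllers();
    app.Run();

BlobServiceClient is thread-safe and intended for reuse, so a singleton lifetime is appropriate here.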

Creating a Container

    var containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
    await containerClient.CreateIfNotExistsAsync();

2.2.1 Uploading a File

2.2.1.1 Using IFormFile

Description:

  • IFormFile is a high-level abstraction in ASP.NET Core used for handling file uploads via the multipart/form-data encoding.
  • It provides an in-memory or temporary storage solution for files uploaded in a form.

How It Works:

  • The entire file is either buffered in memory or saved to a temporary location (e.g., disk) before processing.
  • The framework manages file processing through model binding.

Key Features:

  • File is accessed via properties like FileName, Length, and OpenReadStream().

Example:

    [HttpPost("upload")] // POST api/blob/upload
    [Consumes("multipart/form-data")]
    public async Task<IActionResult> UploadBlobAsync([FromForm] BlobUploadRequest request)
    {
        using var stream = request.File.OpenReadStream();
        await _blobStorageService.UploadBlobAsync(request.ContainerName, request.BlobName, stream);
        return Ok();
    }
public class BlobUploadRequest
{
    public string ContainerName { get; set; }
    public string BlobName { get; set; }
    public IFormFile File { get; set; }
}
    public async Task UploadBlobAsync(string containerName, string blobName, Stream content)
    {
        if (!IsValidContainerName(containerName))
        {
            throw new ArgumentException("Invalid container name.");
        }
        string blobNameWithTimestamp = GenerateBlobNameWithTimestamp(blobName);

        var containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
        await containerClient.CreateIfNotExistsAsync();

        var blobClient = containerClient.GetBlobClient(blobNameWithTimestamp);
        await blobClient.UploadAsync(content, overwrite: true);
    }
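
The service above calls two helpers the article doesn't define. A minimal sketch of both (the validation regex follows Azure's documented container naming rules; the timestamp format is an assumption):

    using System.Text.RegularExpressions;

    // Container names: 3-63 chars, lowercase letters, digits, and single hyphens,
    // starting and ending with a letter or digit
    private static bool IsValidContainerName(string containerName) =>
        !string.IsNullOrEmpty(containerName) &&
        Regex.IsMatch(containerName, "^(?!.*--)[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$");

    // Append a UTC timestamp before the extension to keep blob names unique
    private static string GenerateBlobNameWithTimestamp(string blobName)
    {
        var name = Path.GetFileNameWithoutExtension(blobName);
        var extension = Path.GetExtension(blobName);
        return $"{name}_{DateTimeOffset.UtcNow:yyyyMMdd-HHmmssfff}{extension}";
    }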

Pros:

  • Ease of Use: Very easy to implement and integrates well with form submissions.
  • Validation: Supports model validation (e.g., checking file size or type).
  • Convenience: Provides metadata like the file name and content type.

Cons:

  • Memory Usage: For large files, IFormFile is costly because the framework buffers the upload in memory or in temporary files on disk before your action executes.
  • Not Suitable for Large Files: Uploading very large files can cause performance issues or out-of-memory exceptions.

Best For:

  • Small to medium-sized files.
  • Scenarios where file metadata is included in a form.

2.2.1.2 Using Chunked Uploads

Description:

  • Chunked uploads break a large file into smaller parts (chunks) and upload them sequentially or in parallel.
  • Each chunk is processed independently and combined later at the destination.

How It Works:

  • A client splits the file into chunks of a specified size.
  • Each chunk is sent in a separate request (e.g., via a REST API call or over a WebSocket connection).
  • The server reassembles the chunks into the original file.

Key Features:

  • Typically involves metadata (e.g., file ID, chunk index, and total chunks) for reassembly.

Example:

[HttpPost("upload-chunk")]
 public async Task<IActionResult> UploadChunk(
        [FromForm] IFormFile chunk,
        [FromForm] string containerName,
        [FromForm] string blobName,
        [FromForm] int chunkIndex,
        [FromForm] int totalChunks)
    {
        if (chunk == null || chunk.Length == 0)
        {
            return BadRequest(new { Message = "Chunk is missing or empty." });
        }

        try
        {
            // Stream the chunk to the service
            using var stream = chunk.OpenReadStream();
            await _blobStorageService.UploadChunkAsync(containerName, blobName, stream, chunkIndex, totalChunks);

            return Ok(new { Message = $"Chunk {chunkIndex + 1}/{totalChunks} uploaded successfully." });
        }
        catch (Exception ex)
        {
            return StatusCode(500, new { Message = "Error uploading chunk.", Error = ex.Message });
        }
}
    public async Task UploadChunkAsync(string containerName, string blobName, Stream chunkData, int chunkIndex, int totalChunks)
    {
        if (!IsValidContainerName(containerName))
        {
            throw new ArgumentException("Invalid container name.");
        }

        // Get the container client
        var containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
        await containerClient.CreateIfNotExistsAsync();

        // Get the block blob client (requires "using Azure.Storage.Blobs.Specialized;")
        var blockBlobClient = containerClient.GetBlockBlobClient(blobName);

        // Generate a deterministic block ID for each chunk. Block IDs must be
        // Base64-encoded and of equal length within a blob; "d6" pads the index
        // to 6 digits. (Encoding requires "using System.Text;")
        var blockId = Convert.ToBase64String(Encoding.UTF8.GetBytes(chunkIndex.ToString("d6")));

        // Stage the chunk as a block
        await blockBlobClient.StageBlockAsync(blockId, chunkData);

        // If this is the last chunk, commit the block list. Note: this assumes
        // chunks arrive sequentially; with parallel uploads, track which blocks
        // have been staged and commit only once all of them are in.
        if (chunkIndex + 1 == totalChunks)
        {
            // Create a list of block IDs
            var blockList = Enumerable.Range(0, totalChunks)
                .Select(index => Convert.ToBase64String(Encoding.UTF8.GetBytes(index.ToString("d6"))))
                .ToList();

            // Commit the block list to assemble the final blob
            await blockBlobClient.CommitBlockListAsync(blockList);
        }
    }
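
For completeness, here is a sketch of the client side, splitting a local file into 4 MB chunks and posting them sequentially to the endpoint above (the base URL, container name, and file name are placeholders):

    const int ChunkSize = 4 * 1024 * 1024; // 4 MB per chunk

    using var client = new HttpClient { BaseAddress = new Uri("https://localhost:5001/") };
    await using var file = File.OpenRead("large-video.mp4");

    int totalChunks = (int)Math.Ceiling((double)file.Length / ChunkSize);
    var buffer = new byte[ChunkSize];

    for (int chunkIndex = 0; chunkIndex < totalChunks; chunkIndex++)
    {
        int bytesRead = await file.ReadAsync(buffer, 0, ChunkSize);

        // Field names must match the controller's [FromForm] parameters
        using var form = new MultipartFormDataContent
        {
            { new ByteArrayContent(buffer, 0, bytesRead), "chunk", "large-video.mp4" },
            { new StringContent("media"), "containerName" },
            { new StringContent("large-video.mp4"), "blobName" },
            { new StringContent(chunkIndex.ToString()), "chunkIndex" },
            { new StringContent(totalChunks.ToString()), "totalChunks" }
        };

        var response = await client.PostAsync("api/blob/upload-chunk", form);
        response.EnsureSuccessStatusCode();
    }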

Pros:

  • Scalability: Allows uploading very large files without overwhelming server resources.
  • Fault Tolerance: If a chunk fails to upload, only that chunk needs to be retried.
  • Parallel Uploads: Improves upload speed by uploading multiple chunks simultaneously.

Cons:

  • Complexity: Requires additional logic to manage chunk reassembly and error handling.
  • Metadata Overhead: Involves sending metadata (e.g., file ID, chunk index) with each chunk.
  • Latency: Sequential chunk uploads add per-request overhead, which can increase total upload time.

Best For:

  • Very large files where streaming isn’t practical.
  • Scenarios requiring resumable uploads or fault tolerance.

2.2.1.3 Using Streams

This method streams file content directly from the HTTP request body (Request.Body) to the server or Azure Blob Storage, avoiding the need to fully load the file into memory. It is particularly useful for efficiently handling large file uploads.

How It Works:

  • The server does not buffer the entire file in memory, making it memory-efficient.
  • Ideal for scenarios where the server does not need to process the file content directly but simply stores or forwards it.

Key Features:

  • Direct Streaming: Access file content directly from the request body as a stream.
  • Memory Efficiency: Suitable for large files, as it avoids loading the entire file into memory.
  • Customizable: Allows the addition of metadata via headers or query parameters.

Example:

Controller:

    [HttpPost("stream-upload")]
    public async Task<IActionResult> StreamUploadAsync()
    {
        try
        {
            var containerName = Request.Headers["Container-Name"].ToString();
            var blobName = Request.Headers["Blob-Name"].ToString();

            if (string.IsNullOrEmpty(containerName) || string.IsNullOrEmpty(blobName))
            {
                return BadRequest(new { Message = "Container-Name and Blob-Name headers are required." });
            }

            // Stream data from the client directly to Azure Blob Storage.
            // Request.Body is owned by the framework, so it is not disposed here.
            var stream = Request.Body;

            await _blobStorageService.UploadBlobAsync(containerName, blobName, stream);

            return Ok(new { Message = "File uploaded successfully." });
        }
        catch (Exception ex)
        {
            return StatusCode(500, new { Message = "Error uploading file.", Error = ex.Message });
        }
    }

Service:

The endpoint reuses the same UploadBlobAsync service method shown in section 2.2.1.1: it validates the container name, appends a timestamp to the blob name, and streams the content to the blob with blobClient.UploadAsync.

Explanation

  1. Headers for Metadata:
    • The Container-Name and Blob-Name headers allow you to specify the storage location and file name. These must be passed explicitly as streaming does not include metadata by default.
  2. Streaming Upload:
    • The file is streamed directly from Request.Body to the blob storage, avoiding intermediate buffering and reducing memory usage.
  3. Custom Blob Names:
    • A timestamp can be appended to the blob name using GenerateBlobNameWithTimestamp to ensure uniqueness and prevent overwriting files unintentionally.
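
A client can exercise this endpoint by streaming the file as the raw request body; a minimal sketch (the base URL, container, and file names are placeholders):

    using var client = new HttpClient { BaseAddress = new Uri("https://localhost:5001/") };
    await using var file = File.OpenRead("backup.zip");

    var request = new HttpRequestMessage(HttpMethod.Post, "api/blob/stream-upload")
    {
        // StreamContent sends the file without buffering it fully in client memory
        Content = new StreamContent(file)
    };
    request.Headers.Add("Container-Name", "backups");
    request.Headers.Add("Blob-Name", "backup.zip");

    var response = await client.SendAsync(request);
    response.EnsureSuccessStatusCode();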

Pros:

  • Memory Efficient: Suitable for large files because it avoids buffering the entire file in memory.
  • Direct Processing: Allows streaming directly to storage or a processing pipeline.
  • Scalability: Handles large files without performance degradation.

Cons:

  • No Metadata by Default: File metadata (e.g., name, type) must be passed separately (e.g., via headers or query parameters).
  • Requires Custom Logic: Less out-of-the-box support compared to IFormFile.
  • Streaming Restrictions: Some operations (e.g., seeking) are not available.

Best For:

  • Uploading very large files.
  • Scenarios requiring minimal memory usage and direct streaming.

Comparison Table

Feature           | IFormFile                         | Chunked Uploads                   | Stream
------------------|-----------------------------------|-----------------------------------|---------------------------------
Ease of Use       | Simple and high-level             | More complex                      | Requires manual handling
Memory Usage      | Higher (buffered in memory/disk)  | Moderate (depends on chunk size)  | Very low (direct streaming)
Scalability       | Suitable for small/medium files   | High (handles large files well)   | High (handles very large files)
Fault Tolerance   | Limited                           | High (retry specific chunks)      | Limited (relies on full stream)
Best For          | Small/medium files                | Large files needing resumability  | Very large files
Metadata Handling | Built-in (file name, size, etc.)  | Requires additional fields        | Requires headers/query params
Use Case          | Form-based uploads                | Resumable, fault-tolerant uploads | Large file streaming directly

When to Use Each Strategy

  • IFormFile: Use for simple uploads where file size is small to medium, and you want convenience and ease of use.
  • Chunked Uploads: Use when uploading large files that need resumability or fault tolerance (e.g., poor network conditions).
  • Stream: Use for very large files when you need efficient, direct streaming to storage without memory overhead.

2.2.2 Listing All Files and Retrieving File Content with SAS

2.2.2.1 List All Files from a Container (File URL with SAS Token)

This method retrieves all blobs (files) in a specified Azure Blob Storage container. Optionally, you can filter by a folder path and include a Shared Access Signature (SAS) URI for direct access to the files.

Controller Example:

    [HttpGet("list")] // GET api/blob/list
    public async Task<IActionResult> GetAllBlobsAsync(string containerName, string path = null)
    {
        var blobs = await _blobStorageService.GetAllBlobsAsync(containerName, path, true, DateTimeOffset.UtcNow.AddHours(1));
        return Ok(blobs);
    }

Service Example:

    public async Task<List<BlobDetails>> GetAllBlobsAsync(string containerName, string path = null, bool includeSasUri = false, DateTimeOffset? sasExpiryTime = null)
    {
        var containerClient = _blobServiceClient.GetBlobContainerClient(containerName);

        // BlobTraits.Metadata is required; without it, blobItem.Metadata stays empty
        var blobs = containerClient.GetBlobsAsync(traits: BlobTraits.Metadata, prefix: path);
        var blobDetailsList = new List<BlobDetails>();

        await foreach (var blobItem in blobs)
        {
            var blobDetails = new BlobDetails
            {
                Name = blobItem.Name,
                CreatedOn = blobItem.Properties.CreatedOn,
                Metadata = blobItem.Metadata,
            };

            if (includeSasUri && sasExpiryTime.HasValue)
            {
                blobDetails.SasUri = await GetBlobSasUriAsync(containerName, blobItem.Name, sasExpiryTime.Value);
            }

            blobDetailsList.Add(blobDetails);
        }

        return blobDetailsList;
    }
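
The BlobDetails model and the GetBlobSasUriAsync helper aren't shown in the article; a minimal sketch of both, assuming the BlobServiceClient was created from a shared-key connection string (which is what lets GenerateSasUri sign the URI):

    using Azure.Storage.Sas;

    public class BlobDetails
    {
        public string Name { get; set; }
        public DateTimeOffset? CreatedOn { get; set; }
        public IDictionary<string, string> Metadata { get; set; }
        public string SasUri { get; set; }
    }

    // Generate a read-only SAS URI that expires at the given time
    private Task<string> GetBlobSasUriAsync(string containerName, string blobName, DateTimeOffset expiresOn)
    {
        var blobClient = _blobServiceClient
            .GetBlobContainerClient(containerName)
            .GetBlobClient(blobName);

        var sasUri = blobClient.GenerateSasUri(BlobSasPermissions.Read, expiresOn);
        return Task.FromResult(sasUri.ToString());
    }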

Explanation:

Parameters:

  • containerName: The name of the Azure Blob Storage container from which blobs are listed.
  • path: Optional folder path for filtering blobs.
  • includeSasUri: If true, includes a SAS token URI for each blob.
  • sasExpiryTime: Specifies how long the SAS token will remain valid.

Behavior:

  • The method fetches blobs asynchronously using GetBlobsAsync, which supports efficient streaming of large lists of blobs.
  • Each blob’s details, including its name, creation time, and metadata, are stored in the BlobDetails object.
  • If SAS URIs are requested, the method generates them for secure, time-limited access to the files.

Response Object: The BlobDetails object contains:

  • Name: The name of the blob.
  • CreatedOn: The creation timestamp of the blob.
  • Metadata: Custom metadata associated with the blob.
  • SasUri (optional): A time-bound URI for direct file access.

Security:

  • SAS URIs are generated only if explicitly requested via the includeSasUri parameter.
  • The sasExpiryTime ensures the SAS token is valid only for a limited duration, reducing potential misuse.

Use Case

  • File Browsing: Ideal for listing all files in a container or a specific folder, with the option to generate secure links for download or sharing.
  • Dynamic Access: SAS tokens allow users to access files securely without exposing storage credentials.
  • Metadata Retrieval: Useful for displaying metadata or auditing file properties.

2.2.2.2 Retrieve File Content (Stream)

To retrieve a blob’s content, you can stream the file directly from Azure Blob Storage to the client:

Controller Example:

    [HttpGet("download")] // GET api/blob/download
    public async Task<IActionResult> DownloadBlobAsync(string containerName, string blobName)
    {
        var stream = await _blobStorageService.DownloadBlobAsync(containerName, blobName);
        return File(stream, "application/octet-stream", blobName);
    }

Service Example:

    public async Task<Stream> DownloadBlobAsync(string containerName, string blobName)
    {
        var containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
        var blobClient = containerClient.GetBlobClient(blobName);
        var downloadInfo = await blobClient.DownloadAsync();
        return downloadInfo.Value.Content;
    }

Explanation:

  • Stream-Based Retrieval: This approach allows you to stream the file directly to the client without first saving it to the server’s local storage.
  • Secure Access: The method ensures secure access by leveraging Azure SDK’s built-in authentication mechanisms, such as Managed Identity or connection strings, without exposing storage credentials.
  • Content-Type Handling: application/octet-stream is used to indicate a binary file. This can be adjusted based on the file type (e.g., text/plain for text files, image/jpeg for images); one way to infer it is sketched below.
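
If you want a real MIME type instead of the generic application/octet-stream, one option (a sketch using ASP.NET Core's FileExtensionContentTypeProvider) is to infer it from the blob's file extension:

    using Microsoft.AspNetCore.StaticFiles;

    // Map a file extension to a MIME type, falling back to binary
    private static string GetContentType(string blobName)
    {
        var provider = new FileExtensionContentTypeProvider();
        return provider.TryGetContentType(blobName, out var contentType)
            ? contentType
            : "application/octet-stream";
    }

The controller can then return File(stream, GetContentType(blobName), blobName).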

Use Case:

  • Ideal for File Sharing: Suitable for scenarios where files need to be shared or downloaded securely by authorized clients without exposing storage credentials.
  • Efficient for Large Files: Streaming avoids loading the entire file into memory, making it efficient for large file downloads.

2.2.3 UI Implementation

Let’s build a file upload interface in Angular to interact with the backend services for blob uploads. This interface supports:

  • Drag-and-drop file selection.
  • Chunked uploads for large files.
  • Progress tracking.


