AWS S3 in Elixir with ExAws

In this article we see how to store and retrieve files on AWS S3 using Elixir, with the help of ExAws. (If you want to use ExAws with DigitalOcean Spaces instead, you can read ExAws with DigitalOcean Spaces.)

We start by setting up an AWS account and credentials, configure an Elixir application and see the basic upload and download operations with small files.

Then, we see how to deal with large files, making multipart uploads and using presigned URLs to create a download stream, processing data on the fly.

Create an IAM user, configure permissions and credentials

If you don't have an Amazon Web Services account yet, you can create it on https://aws.amazon.com/ and use the free tier for the first 12 months, where you have up to 5 GB of free S3 storage.

Be sure you check all the limits of the free tier before you start using the service. Always take a look at the billing page to keep track of the usage.

To access AWS S3 resources, first we need to create an AWS IAM (Identity and Access Management) user with limited permissions.

Once logged into the AWS console, go to the users section of the security credentials page and click on Add user.

[Image: Menu on the top-right side of the AWS console]
[Image: Create a new IAM user]

When creating a user, we need to set a username and, most importantly, enable Programmatic access: this means the user can programmatically access the AWS resources via API.

[Image: Username and Programmatic access]

Then we set the permissions, attaching the AmazonS3FullAccess policy and limiting the user to just the S3 service.

[Image: AmazonS3FullAccess policy]

Now, this policy is fine for this demo, but it's still too broad: a user, or an app, can access all the buckets, files and settings of S3.

By creating a custom policy, we can limit the user permissions to only the needed S3 actions and buckets. More on this at AWS User Policy Examples.

Once the user is created, we can download the Access Key ID and the Secret Access Key. You must keep these keys secret, because whoever has them can access your AWS S3 resources.

[Image: IAM user Access key ID and Secret access key]

To create an S3 bucket using the AWS console, go to the S3 section and click on Create bucket, set a bucket name (I've used poeticoding-aws-elixir) and be sure to block all the public access.

[Image: Bucket name and region]
[Image: Block all public access]
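
For completeness, buckets can also be created programmatically. A minimal sketch, assuming the ex_aws setup shown in the next section and the eu-west-1 region (both bucket name and region here are just examples):

    # a hedged sketch: create the bucket via API instead of the console
    # (requires the ex_aws configuration from the next section)
    ExAws.S3.put_bucket("poeticoding-aws-elixir", "eu-west-1")
    |> ExAws.request!()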

Configure ex_aws and environment variables

Let's create a new Elixir application and add the dependencies to make ex_aws and ex_aws_s3 work

    # mix.exs
    def deps do
      [
        {:ex_aws, "~> 2.1"},
        {:ex_aws_s3, "~> 2.0"},
        {:hackney, "~> 1.15"},
        {:sweet_xml, "~> 0.6"},
        {:jason, "~> 1.1"}
      ]
    end

ExAws, by default, uses the hackney HTTP client to make requests to AWS.
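
The HTTP client is configurable: ex_aws accepts an :http_client module implementing the ExAws.Request.HttpClient behaviour. A minimal sketch that just spells out the default:

    # config/config.exs - making the default HTTP client explicit;
    # a custom client would implement ExAws.Request.HttpClient instead
    config :ex_aws,
      http_client: ExAws.Request.Hackney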

We create the config/config.exs configuration file, where we set the access key ID and secret access key

    # config/config.exs

    import Config

    config :ex_aws,
      json_codec: Jason,
      access_key_id: {:system, "AWS_ACCESS_KEY_ID"},
      secret_access_key: {:system, "AWS_SECRET_ACCESS_KEY"}

The default ExAws JSON codec is Poison. If we want to use another library, like Jason, we need to explicitly set the json_codec option.

We don't want to write our keys in the configuration file: first, because whoever has access to the code can see them; second, because we want to make them easy to change.

We can use environment variables: by passing the {:system, "AWS_ACCESS_KEY_ID"} and {:system, "AWS_SECRET_ACCESS_KEY"} tuples, the application gets the keys from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
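
The credential sources can also be given as a list, tried in order. For instance, a hedged sketch following the pattern in the ExAws docs, falling back to the "default" profile of the AWS CLI credentials file when the environment variable is missing (the last tuple element is a timeout):

    # config/config.exs - try the env variable first, then ~/.aws/credentials
    config :ex_aws,
      access_key_id: [{:system, "AWS_ACCESS_KEY_ID"}, {:awscli, "default", 30}],
      secret_access_key: [{:system, "AWS_SECRET_ACCESS_KEY"}, {:awscli, "default", 30}]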

In case you are on a Unix/Unix-like system (like macOS or Linux), you can set these environment variables in a script

    # .env file
    export AWS_ACCESS_KEY_ID="your access key"
    export AWS_SECRET_ACCESS_KEY="your secret access key"

and load them with source

    $ source .env
    $ iex -S mix

Keep this script secret. If you are using git, remember to put this script into .gitignore to avoid committing this file.

If you don't want to keep these keys in a script, you can always pass them when launching the application or iex

    $ AWS_ACCESS_KEY_ID="..." \
      AWS_SECRET_ACCESS_KEY="..." \
      iex -S mix

In case you're on a Windows machine, you can set the environment variables using the Command Prompt or PowerShell

    # Windows CMD
    set AWS_ACCESS_KEY_ID="..."

    # Windows PowerShell
    $env:AWS_ACCESS_KEY_ID="..."

List the buckets

Now we have everything ready: credentials, application dependencies and ex_aws configured with environment variables. So let's try the first request.

    # load the environment variables
    $ source .env

    # run iex
    $ iex -S mix

    iex> ExAws.S3.list_buckets()
    %ExAws.Operation.S3{
      http_method: :get,
      parser: &ExAws.S3.Parsers.parse_all_my_buckets_result/1,
      path: "/",
      service: :s3,
      ...
    }

The ExAws.S3.list_buckets() function doesn't send the request itself; it returns an ExAws.Operation.S3 struct. To make a request we use ExAws.request or ExAws.request!
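
While ExAws.request! raises on failure, ExAws.request returns a tagged tuple, which lets us handle errors explicitly. A minimal sketch:

    case ExAws.S3.list_buckets() |> ExAws.request() do
      {:ok, %{body: body}} ->
        # the request succeeded: body.buckets holds the bucket list
        body.buckets

      {:error, reason} ->
        # e.g. bad credentials or a network error
        IO.inspect(reason, label: "S3 error")
        []
    end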

    iex> ExAws.S3.list_buckets() |> ExAws.request!()

    %{
      body: %{
        buckets: [
          %{
            creation_date: "2019-11-25T17:48:16.000Z",
            name: "poeticoding-aws-elixir"
          }
        ],
        owner: %{ ... }
      },
      headers: [
        ...
        {"Content-Type", "application/xml"},
        {"Transfer-Encoding", "chunked"},
        {"Server", "AmazonS3"},
        ...
      ],
      status_code: 200
    }

ExAws.request! returns a map with the HTTP response from S3. With get_in/2 we can get just the bucket list

    ExAws.S3.list_buckets()
    |> ExAws.request!()
    |> get_in([:body, :buckets])

    [%{creation_date: "2019-11-25T17:48:16.000Z", name: "poeticoding-aws-elixir"}]

put, list, get and delete

With ExAws, the easiest way to upload a file to S3 is with ExAws.S3.put_object/4

    iex> local_image = File.read!("elixir_logo.png")
    <<137, 80, 78, 71, 13, 10, 26, 10, 0, 0, ...>>

    iex> ExAws.S3.put_object("poeticoding-aws-elixir", "images/elixir_logo.png", local_image) \
    ...> |> ExAws.request!()

    %{
      body: "",
      headers: [...],
      status_code: 200
    }

The first argument is the bucket name, then we pass the object key (the path) and the third is the file's content, local_image. As a fourth argument we can pass a list of options like storage class, meta, encryption etc.
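
For example, we could set the content type and attach some custom metadata. A hedged sketch: the option names come from the ExAws.S3 docs, the values are just illustrative:

    ExAws.S3.put_object(
      "poeticoding-aws-elixir",
      "images/elixir_logo.png",
      local_image,
      content_type: "image/png",           # stored as the object's Content-Type
      meta: [{"uploaded-by", "demo-app"}]  # becomes the x-amz-meta-uploaded-by header
    )
    |> ExAws.request!()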

Using the AWS management console, on the S3 bucket's page, we can see the file we've just uploaded.

[Image: Uploaded file visible on AWS Management console]

We list the bucket's objects with ExAws.S3.list_objects

    iex> ExAws.S3.list_objects("poeticoding-aws-elixir") \
    ...> |> ExAws.request!() \
    ...> |> get_in([:body, :contents])

    [
      %{
        e_tag: "\"...\"",
        key: "images/elixir_logo.png",
        last_modified: "2019-11-26T14:40:34.000Z",
        owner: %{ ... },
        size: "29169",
        storage_class: "STANDARD"
      }
    ]

Passing the bucket name and object key to ExAws.S3.get_object/2, we get the file's content.

    iex> resp = ExAws.S3.get_object("poeticoding-aws-elixir", "images/elixir_logo.png") \
    ...> |> ExAws.request!()

    %{
      body: <<137, 80, 78, 71, 13, 10, 26, ...>>,
      headers: [
        {"Last-Modified", "Tue, 26 Nov 2019 14:40:34 GMT"},
        {"Content-Type", "application/octet-stream"},
        {"Content-Length", "29169"},
        ...
      ],
      status_code: 200
    }

The request returns a response map with the whole file's content in :body.

    iex> File.read!("elixir_logo.png") == resp.body
    true

We can delete the object with ExAws.S3.delete_object/2.

    iex> ExAws.S3.delete_object("poeticoding-aws-elixir", "images/elixir_logo.png") \
    ...> |> ExAws.request!()

    %{
      body: "",
      headers: [
        {"Date", "Tue, 26 Nov 2019 15:04:35 GMT"},
        ...
      ],
      status_code: 204
    }

Listing the objects again, we see that, as expected, the bucket is now empty.

    iex> ExAws.S3.list_objects("poeticoding-aws-elixir") \
    ...> |> ExAws.request!() \
    ...> |> get_in([:body, :contents])

    []

Multipart upload and large files

The image in the example above is just ~30 KB and we can simply use put_object and get_object to upload and download it, but there are some limits:

  • with these two functions the file is fully kept in memory, for both upload and download.
  • put_object uploads the file in a single operation, and we can upload only objects up to 5 GB in size.

S3 and the ExAws client support multipart uploads: a file is divided into parts (5 MB parts by default) which are sent separately and in parallel to S3! In case a part's upload fails, ExAws retries the upload of that 5 MB part only.

With multipart uploads we can upload objects from 5 MB to 5 TB – ExAws uses file streams, avoiding keeping the whole file in memory.

Let's consider numbers.txt, a relatively large txt file we've already seen in another article – Elixir Stream and large HTTP responses: processing text (you can download it from https://www.poeticoding.com/downloads/httpstream/numbers.txt).

numbers.txt's size is 125 MB, much smaller than the 5 GB limit imposed by a single PUT operation, but to me this file is large enough to benefit from a multipart upload!

    iex> ExAws.S3.Upload.stream_file("numbers.txt") \
    ...> |> ExAws.S3.upload("poeticoding-aws-elixir", "numbers.txt") \
    ...> |> ExAws.request!()

    # returned response
    %{
      body: "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n\n<CompleteMultipartUploadResult>...",
      headers: [
        {"Date", "Tue, 26 Nov 2019 16:34:08 GMT"},
        {"Content-Type", "application/xml"},
        {"Transfer-Encoding", "chunked"}
      ],
      status_code: 200
    }
  • First we create a file stream with ExAws.S3.Upload.stream_file/2
  • The stream is passed to ExAws.S3.upload/4, along with the bucket name and object key
  • ExAws.request! initializes the multipart upload and uploads the parts

To have an idea of what ExAws is doing, we can enable the debug option in the ex_aws configuration

    # config/config.exs

    config :ex_aws,
      debug_requests: true,
      json_codec: Jason,
      access_key_id: {:system, "AWS_ACCESS_KEY_ID"},
      secret_access_key: {:system, "AWS_SECRET_ACCESS_KEY"}

We should see multiple parts being sent at the same time

    17:11:24.586 [debug] ExAws: Request URL: "...?partNumber=2&uploadId=..." Attempt: 1
    17:11:24.589 [debug] ExAws: Request URL: "...?partNumber=1&uploadId=..." Attempt: 1
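
The number of parts uploaded in parallel can be tuned as well: ExAws.S3.upload/4 accepts a :max_concurrency option. A minimal sketch (the value below is illustrative):

    ExAws.S3.Upload.stream_file("numbers.txt")
    |> ExAws.S3.upload(
      "poeticoding-aws-elixir", "numbers.txt",
      max_concurrency: 8  # upload up to 8 parts at the same time
    )
    |> ExAws.request!()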

Multipart upload timeout

When the file is big, the upload could take time. To upload the parts concurrently, ExAws uses Elixir Tasks – the default timeout for a part's upload is set to 30 seconds, which may not be enough over a slow connection.

    ** (exit) exited in: Task.Supervised.stream(30000)
        ** (EXIT) time out

We can change the timeout by passing a new :timeout to ExAws.S3.upload/4 – 120 seconds in this example.

    ExAws.S3.Upload.stream_file("numbers.txt")
    |> ExAws.S3.upload(
      "poeticoding-aws-elixir", "numbers.txt",
      [timeout: 120_000])
    |> ExAws.request!()

Download a large file

To download a big file it's better to avoid get_object, which holds the whole file content in memory. With ExAws.S3.download_file/4, instead, we can download the data in chunks, saving it directly into a file.

    ExAws.S3.download_file(
      "poeticoding-aws-elixir",
      "numbers.txt",
      "local_file.txt"
    )
    |> ExAws.request!()
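
download_file/4 takes options too: in particular :max_concurrency and :chunk_size control how many chunks are fetched in parallel and how big they are (option names from the ExAws.S3 docs; the values here are illustrative):

    ExAws.S3.download_file(
      "poeticoding-aws-elixir",
      "numbers.txt",
      "local_file.txt",
      max_concurrency: 4,           # fetch 4 chunks in parallel
      chunk_size: 10 * 1024 * 1024  # 10 MB per chunk
    )
    |> ExAws.request!()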

Presigned URLs and download streams – process a file on the fly

Unfortunately we can't use ExAws.S3.download_file/4 to get a download stream and process the file on the fly.

However, we can generate a presigned URL to get a unique and temporary URL, and then download the file with a library like mint or HTTPoison.

    iex> ExAws.Config.new(:s3) \
    ...> |> ExAws.S3.presigned_url(:get, "poeticoding-aws-elixir", "numbers.txt")

    {:ok, "https://...?X-Amz-Credential=...&X-Amz-Expires=3600"}

By default, the URL expires after one hour – with the :expires_in option we can set a different expiration time (in seconds).

    iex> ExAws.Config.new(:s3) \
    ...> |> ExAws.S3.presigned_url(:get, "poeticoding-aws-elixir",
    ...>      "numbers.txt", [expires_in: 300])  # 300 seconds

    {:ok, "https://...?X-Amz-Credential=...&X-Amz-Expires=300"}

Now that we have the URL, we can use Elixir streams to process the data on the fly and calculate the sum of all the lines in numbers.txt. You can find the HTTPStream code, and how it works, in the article linked above.

    # generate the presigned URL
    ExAws.Config.new(:s3)
    |> ExAws.S3.presigned_url(:get, "poeticoding-aws-elixir", "numbers.txt")

    # returning just the URL string to the next step
    |> case do
      {:ok, url} -> url
    end

    # using HTTPStream to download the file in chunks
    # getting a Stream of lines
    |> HTTPStream.get()
    |> HTTPStream.lines()

    ## converting each line to an integer
    |> Stream.map(fn line ->
      case Integer.parse(line) do
        {num, _} -> num
        :error -> 0
      end
    end)

    ## sum the numbers
    |> Enum.sum()
    |> IO.inspect(label: "result")

In the first two lines we generate a presigned URL. Then, with HTTPStream.get, we create a stream that lazily downloads the file chunk by chunk; we transform the chunks into lines with HTTPStream.lines, map the lines into integers and sum all the numbers. The result should be 12468816.
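
The real HTTPStream module lives in that article; purely as a reference, here is a minimal reconstruction of the same idea on top of HTTPoison's async API. This is a hedged sketch, not the article's exact code (for simplicity, a final line without a trailing newline is dropped):

    defmodule HTTPStream do
      # lazily download `url`, emitting one binary chunk at a time
      def get(url) do
        Stream.resource(
          fn -> HTTPoison.get!(url, [], stream_to: self(), async: :once) end,
          fn %HTTPoison.AsyncResponse{id: id} = resp ->
            receive do
              %HTTPoison.AsyncStatus{id: ^id} ->
                HTTPoison.stream_next(resp)
                {[], resp}

              %HTTPoison.AsyncHeaders{id: ^id} ->
                HTTPoison.stream_next(resp)
                {[], resp}

              %HTTPoison.AsyncChunk{id: ^id, chunk: chunk} ->
                HTTPoison.stream_next(resp)
                {[chunk], resp}

              %HTTPoison.AsyncEnd{id: ^id} ->
                {:halt, resp}
            end
          end,
          fn %HTTPoison.AsyncResponse{id: id} -> :hackney.stop_async(id) end
        )
      end

      # re-chunk a stream of binaries into a stream of lines
      def lines(chunks) do
        Stream.transform(chunks, "", fn chunk, acc ->
          # keep the trailing partial line in the accumulator,
          # emit all the complete lines found so far
          [rest | found] =
            (acc <> chunk)
            |> String.split("\n")
            |> Enum.reverse()

          {Enum.reverse(found), rest}
        end)
      end
    end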

Source: https://www.poeticoding.com/aws-s3-in-elixir-with-exaws/
