Updated February 2021
No doubt, we’re spoiled for choice with today’s variety of file storage options. Cloud services are the most popular on the market thanks to their accessibility and ease of use. By some estimates, there are over 2.3 billion cloud storage users across the globe, and this figure is expected to grow even further.
With its scalable infrastructure and strong security measures, Amazon S3 is a top media library choice for many consumers. Businesses of all sizes and industries shift to this storage after they’ve heard a lot about Amazon's spotless reputation and its striving for perfection.
But Amazon was never looking for any shortcuts. Its object system sets the storage apart from the competition, while also complicating the onboarding process for new users. Still, take it easy! With this article, we’re going to tell you about Amazon S3 pitfalls.
As we have recently built DAM integration with Amazon at Pics.io, uploading and managing files on S3 became really important for our users. So here we are, bringing benefit to everyone who needs this info to make working with S3 a piece of cake for you.
Amazon S3: Some important terminology
As a new user to Amazon, you may be puzzled when you first open your account. Where is the traditional file and folder organization? What is a secret key and why are my precious files stored in buckets?
Here is a short list of terms you might want to know before even signing in to your account:
- AWS (Amazon Web Services) Management Console. This is a web-based application through which you will access and manage your cloud storage. You’ll need your user name & password to sign in to your account.
- Root user vs. IAM (Identity and Access Management) user. There are two types of users in AWS: an account owner (or root user) and users granted certain roles and access privileges (IAM users). A pro tip: for safety and security, Amazon recommends reducing the use of root user credentials to a minimum. Instead, you can create an IAM user and grant them full access.
- Access Key ID and Secret Key. Next to Console access (developed mostly for users with a limited technical background), there also exists programmatic access. And here you’ll need AWS access keys to make programmatic calls.
- Bucket. In your Amazon S3 Console, you create buckets - a so-called parent folder for your assets and their metadata. By default, Amazon S3 grants you 100 buckets per account, but you can raise this limit to up to 1000 buckets by requesting a service quota increase.
Bucket = Object 1 + Object 2 + Object 3
- Object. We store objects in buckets. Composed of a file plus (optionally) its metadata, an object can be any kind of file you need to upload: a text file, an image, video, audio, and so on. A single object can be up to 5 TB in size, although uploads through the Console are limited to 160 GB per file.
Object = file + metadata (optionally)
- Folders. You can group your objects by folders. But remember that Amazon S3 has a flat file system in contrast to traditional hierarchy where your assets are grouped in directories and subdirectories. Flat structure means that you achieve organizational simplicity with the help of unique file and folder names. For example, you add a project name + client name + due date so you won’t meet the same name across the storage.
- Region. Amazon S3 buckets are region-specific. This means you choose the geographic location where you want the company to store your assets. Remember that objects in the bucket won’t leave their location unless you specifically transfer them to a different region.
- Key names & prefixes. Key names refer to object names, and together with prefixes (a common string in the object names), they help you access the needed file quicker and easier. Let’s say you store photo1 in folder1 in your bucket. Then you can search your files by entering bucket/folder1/photo1 instead of opening folders and buckets manually.
We’ve set up an Amazon S3 account: How to get started with using it
How to create a bucket?
After you’ve signed in to your AWS Console and accessed your S3 user interface, it’s high time to explore your user account. So, the first thing you do is create a bucket.
Here you need to indicate your bucket name and the geographical location where you want Amazon to store your bucket and its content. Be attentive when you choose your bucket name, because all buckets share one global namespace. The system won’t let you go unless the name 1) is unique across all of Amazon S3 (not just your account); 2) is between 3 and 63 characters long; and 3) contains only lowercase letters, numbers, hyphens, and dots, beginning and ending with a letter or number.
As for the region, the storage allows you to create a bucket in the location you want. And the best idea is to choose the one that is the closest to you. In this way, you’ll not only reduce response time but also cut costs and meet regulatory requirements.
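The naming and region rules above can be checked before you ever call AWS. Here is a minimal sketch, assuming you would later pass the result to boto3's `create_bucket` call; the helper names `is_valid_bucket_name` and `create_bucket_kwargs` are made up for illustration, and the check covers only a simplified subset of the naming rules (global uniqueness can only be verified by S3 itself).

```python
import re

# Simplified subset of the bucket naming rules: 3-63 characters,
# lowercase letters, digits, dots and hyphens, starting and ending
# with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the basic naming checks above."""
    if not _BUCKET_NAME.fullmatch(name):
        return False
    # Two dots in a row are not allowed.
    return ".." not in name


def create_bucket_kwargs(name: str, region: str) -> dict:
    """Build keyword arguments for a boto3 `create_bucket` call."""
    if not is_valid_bucket_name(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    kwargs = {"Bucket": name}
    # us-east-1 is the default region and is not passed as a constraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs
```

With a real client you would then call something like `boto3.client("s3", region_name="eu-west-1").create_bucket(**create_bucket_kwargs("acme-campaign-assets", "eu-west-1"))`; the bucket name here is only an example.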
What else can I do with my bucket?
In the same menu, you also set permissions and configure options. Depending on the roles in your team, you decide who will create, edit, and delete objects in your bucket.
Public vs. individual access
Then choose between public and individual access. For the sake of security, we don’t recommend granting public access unless you use this bucket to share your files with many clients or partners. If this is not the case, you can always make particular files publicly accessible to others.
As for configuration options, enable versioning if you’re planning to store different revisions of the same object. Let’s say you’re designing a new logo for your marketing campaign. Throughout this process, there will be multiple updates to your file when you experiment with the color palette or elaborate on the font.
With versioning, different versions of your file are stored under the same key. When you access the object, you get the latest version by default, and you can still retrieve any earlier version by its version ID. Amazon S3 users also appreciate versioning when it comes to application failures or unintended actions (for example, when your colleague has deleted the precise revision you all agreed to use).
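To make the idea concrete, here is a toy in-memory model of a versioned bucket. It is not the AWS API, just a sketch of the behavior: every upload under the same key adds a version, a plain read returns the newest one, and older ones are fetched by version ID. The class name and the `v1`, `v2` style IDs are invented for the example (real S3 version IDs are opaque strings).

```python
import itertools


class ToyVersionedBucket:
    """In-memory stand-in for a versioning-enabled bucket (not the AWS API)."""

    def __init__(self):
        self._versions = {}            # key -> list of (version_id, data)
        self._ids = itertools.count(1)

    def put(self, key, data):
        """Store a new version under `key` and return its version ID."""
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one if an ID is given."""
        versions = self._versions[key]
        if version_id is None:
            return versions[-1][1]
        for vid, data in versions:
            if vid == version_id:
                return data
        raise KeyError(version_id)
```

For the logo example: after `first = bucket.put("logo.png", b"blue")` and `bucket.put("logo.png", b"green")`, a plain `bucket.get("logo.png")` gives the green draft, while `bucket.get("logo.png", first)` still returns the blue one.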
Server vs. object access logging
Tick server access logging in case you want to track requests made in a bucket. Access log reports come in handy to you in times of audits and as a safety precaution.
You can also try out a more advanced object-level logging. On this occasion, you’re free to filter events to be logged, and you track them in CloudTrail - a separate AWS auditing service.
Finally, encrypt your files if you want to additionally secure your data. Encryption means encoding your information so that it can be read only with the matching decryption key. Without going any further into coding, S3 storage lets you choose default encryption when you create a new bucket, so every new object is encrypted on the server side automatically.
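If you prefer to set default encryption programmatically rather than in the Console, a sketch along these lines should work. It assumes `s3` is a boto3 S3 client (for example `boto3.client("s3")`); the two helper function names are hypothetical, while `put_bucket_encryption` and the configuration shape follow the boto3 S3 API.

```python
def default_encryption_config(algorithm="AES256", kms_key_id=None):
    """Build a default-encryption configuration for a bucket.

    "AES256" selects S3-managed keys; "aws:kms" selects AWS KMS keys.
    """
    rule = {"SSEAlgorithm": algorithm}
    if kms_key_id:
        rule["KMSMasterKeyID"] = kms_key_id
    return {"Rules": [{"ApplyServerSideEncryptionByDefault": rule}]}


def enable_default_encryption(s3, bucket, **options):
    """Apply the configuration; `s3` is assumed to be a boto3 S3 client."""
    s3.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=default_encryption_config(**options),
    )
```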
Getting inside the bucket
How to upload your files to Amazon S3?
In the bucket, we store our objects (files + metadata) and use folders if we need to group our files. You’ll see that uploading files to S3 storage is as easy as pie. You just press on upload and either drag’n’drop your materials or point-and-click them - use the way which is more convenient for you. Click on create a folder if you’re willing to group your objects in folders.
Other than that, feel free to upload a whole folder. Only the drag’n’drop option is available to you in this case. Still it simplifies the task if you need to upload a broad scope of files and reflect their structure. With folder upload, Amazon S3 mirrors its structure and uploads all the subfolders even though this way could overall be more time-consuming.
What else should I know when uploading objects to S3 storage?
If you don’t mind, I'll repeat myself: Amazon S3 uses object-based storage. This means no filesystem at all! Every file you upload (whatever the origin, type, and format) gets converted to an object and is found in a bucket afterward.
Since there is no filesystem (at least in the common meaning of this word), we won’t speak about names as filenames anymore. (And we’ve already mentioned that S3 is all about unique names - this is how we organize and access files in this storage service.) This is why, when you upload a new object through the Console, you aren’t asked to choose a separate name for it: the name of the file you upload (plus any folder path) becomes part of the object’s identifier.
But to compensate for non-existing filenames, the service uses an object key (or key name) which uniquely defines an object in the bucket.
What are other configuration options during the upload? The same as with buckets, you can also use encryption to secure your data and manage public permissions. Plus, you can make a particular file accessible to a certain user or users.
Choose storage classes based on how often you’re planning to access your data. For example, S3 Standard (the default type) is designed for critical, non-reproducible data you’re going to manage regularly.
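The upload options described above (storage class, encryption, metadata) map onto the `ExtraArgs` parameter of boto3's `upload_file`. Below is a minimal sketch, assuming `s3` is a boto3 S3 client; the helper names and the list of storage classes are illustrative assumptions, not a complete reference.

```python
# A subset of storage class names accepted by S3 (assumed, not exhaustive).
STORAGE_CLASSES = {
    "STANDARD", "STANDARD_IA", "ONEZONE_IA",
    "INTELLIGENT_TIERING", "GLACIER", "DEEP_ARCHIVE",
}


def upload_extra_args(storage_class="STANDARD", encrypt=True,
                      metadata=None, content_type=None):
    """Build the ExtraArgs dictionary for an upload."""
    if storage_class not in STORAGE_CLASSES:
        raise ValueError(f"unknown storage class: {storage_class}")
    args = {"StorageClass": storage_class}
    if encrypt:
        args["ServerSideEncryption"] = "AES256"  # S3-managed keys
    if metadata:
        args["Metadata"] = dict(metadata)        # user-defined metadata
    if content_type:
        args["ContentType"] = content_type
    return args


def upload(s3, path, bucket, key, **options):
    """Upload a local file; `s3` is assumed to be a boto3 S3 client."""
    s3.upload_file(path, bucket, key, ExtraArgs=upload_extra_args(**options))
```

For rarely opened archive material you might call `upload(s3, "brief.pdf", "acme-campaign-assets", "docs/brief.pdf", storage_class="STANDARD_IA")`; the file, bucket, and key names are placeholders.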
Metadata vs. Tagging
Apart from a key (and data), each S3 object has a set of metadata you set when uploading it. In brief, this is additional information about the object like when it was created or by whom. The metadata is stored as name-value pairs, where the name identifies a piece of metadata and the value holds its content.
To put it simply, Content-Length and Content-Type are the names when we’re referring to these kinds of metadata. Accordingly, their values will be the object size in bytes and the file type, like PDF, text, video, audio, or any other format you can think about.
Following the same logic, you can add tags to your files that help to search, organize, and manage access to your objects. Tags are the same key-value pairs, and they’re also kinds of metadata, but there is a significant difference between them.
An object in S3 is immutable, and so is its metadata as a part of the object. The AWS Console does allow you “to edit” metadata, but this doesn’t reflect what really happens. In fact, S3 replaces the object with a new copy that carries the updated metadata every time you modify it (in a bucket with versioning enabled, this creates a new version).
The situation is different with tags - they’re additional, “subresource” information concerning an object. Since they’re managed separately, you won’t change a file when adding tags to it. Overall, you can choose up to 10 tags per object in S3.
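Because tags are managed separately from the object, they can be set with their own API call. The sketch below assumes `s3` is a boto3 S3 client and uses its `put_object_tagging` call; the helper names are hypothetical, and the limit of 10 comes from the text above.

```python
MAX_TAGS_PER_OBJECT = 10


def build_tag_set(tags: dict) -> dict:
    """Turn a plain dict into the TagSet structure used for object tagging."""
    if len(tags) > MAX_TAGS_PER_OBJECT:
        raise ValueError(f"at most {MAX_TAGS_PER_OBJECT} tags per object")
    return {"TagSet": [{"Key": k, "Value": v} for k, v in tags.items()]}


def tag_object(s3, bucket, key, tags):
    """Replace the tags on an existing object without touching its data."""
    s3.put_object_tagging(Bucket=bucket, Key=key, Tagging=build_tag_set(tags))
```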
Folders as a means of grouping objects
How do we use folders in S3?
As you’ve probably figured it out, buckets and objects play central roles in S3 storage. But this is not the case with folders. These were only added to compensate for the absent file hierarchical system to improve file management and access.
In Amazon S3, folders help you to find your files thanks to prefixes (the part of the key name that comes before the file name). Let’s say you create a folder named images, and there you store an object with the key name images/photo1.jpg. “images/” is the prefix in this case (key names are case-sensitive, so Images and images would be different prefixes). “/” is the delimiter, automatically added by the system (avoid it in your folder names). The more folders and subfolders you create, the longer the prefix of your file’s key will get.
And so you can use these prefixes to access your data. Just type a prefix into the search box in the S3 Console to filter the list of objects.
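To see how "folders" fall out of a flat list of keys, here is a small pure-Python sketch that mimics a listing by prefix and delimiter (the same idea as the `Prefix` and `Delimiter` parameters of the S3 `list_objects_v2` call). The function name and the sample keys are made up for the example.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Group a flat list of keys the way a prefix + delimiter listing does.

    Returns (objects directly under the prefix, "subfolder" prefixes).
    """
    objects, subfolders = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter acts as a subfolder.
            subfolders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(subfolders)
```

Nothing is nested in storage: the "folder" view is computed from the key names each time you list them.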
Actions with folders and objects
What you can do with your files and folders is pretty standard in Amazon S3 storage. You can create new folders, delete them, make them public, copy, and move them. You can also change their metadata, encryption, storage class, and tags - but, as expected, no renaming option is available.
Your interaction with objects won’t be very different. With Amazon S3, you’ll have no problem with uploading and copying objects. Plus, you can open your assets, move, download, and delete them (in different formats if needed).
An interesting option includes recovering deleted objects, which can be especially useful in case of system failures. But be aware that “undeleting” objects is possible only in buckets with enabled versioning.
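In a versioned bucket, deleting an object only places a "delete marker" on top of it, so undeleting comes down to removing that marker. Here is a minimal sketch, assuming `s3` is a boto3 S3 client; `undelete` is a hypothetical helper name, and for brevity it looks only at the first page of results.

```python
def undelete(s3, bucket, key):
    """Remove the current delete marker for `key`, if there is one.

    Returns True when a marker was removed, False otherwise.
    """
    response = s3.list_object_versions(Bucket=bucket, Prefix=key)
    for marker in response.get("DeleteMarkers", []):
        if marker["Key"] == key and marker["IsLatest"]:
            # Deleting the marker itself makes the previous version current.
            s3.delete_object(Bucket=bucket, Key=key,
                             VersionId=marker["VersionId"])
            return True
    return False
```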
Getting back to upload again: Moving big data to Amazon S3
We’ve made it pretty clear that uploading assets to Amazon S3 should not cause any difficulties. And it is so if we’re speaking about small-scale data. But what if your digital library extends to 1000 files or 10 000, or a million? Can you imagine you drag’n’drop these files or point-and-click them?
What a waste of time it could be! Fortunately, there are other, easier and faster ways to move massive data to your S3 storage…
1) Direct Connect is an excellent solution for transferring large amounts of data. Its idea is to create a direct connection between your on-premises data sources and Amazon’s network. In this way, you bypass any obstacles created by your internet provider and web traffic and move your data quicker and easier.
As usual, you can request a connection in the AWS Console: just choose the region you want to use, set the number of ports and their speed, and you can apply the solution.
When to use this solution?
- When you need to transfer large-scale data, and your Internet connection is slow.
- When you’re eager to reduce costs and achieve a more consistent network experience.
2) AWS DataSync is often used alongside Direct Connect, but it’s a more sophisticated, software-based transfer service with improved management and automation options. For example, AWS DataSync allows you to track your transfers, schedule particular processes, and adjust speed and bandwidth.
But a more advanced solution also means more complex transfers, doesn’t it? So users with limited technical knowledge may find DataSync too difficult.
When to use this solution?
- When you need additional automation for your data transfers to cut costs. For instance, choose DataSync if you want to filter your migration of data, pointing out which folders/files to move first.
- When you work in a large enterprise and your data transfer will be completed under the supervision of developers.
3) Amazon S3 Transfer Acceleration was designed to speed up your data migration processes to S3 storage specifically. The solution works perfectly for data transfers across long physical distances.
When to use this solution?
- When you need a fast transfer and/or for a longer distance.
- When you have to move your data to or from one bucket only (acceleration is switched on per bucket).
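Transfer Acceleration is enabled per bucket and then reached through a dedicated endpoint. The sketch below assumes `s3` is a boto3 S3 client and uses its `put_bucket_accelerate_configuration` call; the two helper names are invented for illustration.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Return the Transfer Acceleration endpoint for a bucket."""
    # Bucket names containing dots cannot be used with acceleration.
    if "." in bucket:
        raise ValueError("bucket names with dots do not support acceleration")
    return f"https://{bucket}.s3-accelerate.amazonaws.com"


def enable_acceleration(s3, bucket):
    """Switch acceleration on; `s3` is assumed to be a boto3 S3 client."""
    s3.put_bucket_accelerate_configuration(
        Bucket=bucket,
        AccelerateConfiguration={"Status": "Enabled"},
    )
```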
4) Amazon Kinesis Data Firehose is a real-time streaming data delivery tool you enable through the AWS Console. The service is good due to its easy-peasy interface - you set up the delivery in a few clicks. Plus, it’s more cost-efficient than other Amazon solutions as you pay not for the service but for the amount of data you transfer.
When to use this solution?
- When you’re looking for a streaming data migration tool.
- When you don’t want to waste your time on administration.
- When you’re planning to cut costs by means of paying as you go.
5) Tsunami UDP is one of a few free-of-charge solutions available to move big data to Amazon S3. It isn’t a web-based service, so you’ll have to download and install the tool. Plus, basic coding knowledge is necessary to work with this solution.
When to use this solution?
- When your budget is limited, but you still need to move large-scale data to your S3 storage.
- When you need to transfer files only (the solution doesn’t support moving folders and subfolders).
- When you don’t own any sensitive data: Tsunami UDP doesn’t encrypt your data.
6) Pics.io Data Migration is a new service delivered by Pics.io DAM. Choose this option if you need to transfer big data (and metadata!) to your AWS storage but don’t want to go into trouble by doing everything on your own.
When to use this solution?
- When you need to migrate your data from one source to another. It could be from another cloud storage to Amazon S3. Or moving files between your buckets.
- When you want to complete migration quickly & easily. In this case, you just contact the Pics.io support team, grant a few permissions, & your DAM solution takes care of the rest.
- When you’re planning to move metadata together with your files. This option allows you to preserve your folder structure, transfer keywords, file descriptions, and so on.
- When you care about the security of your information. Pics.io will complete the upload in the safest way possible.
7) AWS Import/Export Disk is an offline data transfer solution. Here you upload your data to a portable device and ship it to AWS. Then the company moves the data to your storage directly using its high-speed internal network. As a rule, this happens on the next business day after Amazon receives your device. As soon as the import/export is completed, the company sends back your external hard drive.
When to use this solution?
- When preparing and mailing your data sets will still take less time than uploading your files in any other way. Approximately, you’d better consider this option when the size of your data is larger than 100 GB.
- When shipping an external hard drive with your data will remain cheaper than upgrading your connectivity (in case you’re planning to move your data online).
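Whether shipping beats uploading comes down to simple arithmetic: data size divided by your usable bandwidth. Here is a rough estimator; the function name and the 80% utilization default are assumptions for illustration, and real throughput depends on your network.

```python
def upload_days(size_gb: float, mbps: float, utilization: float = 0.8) -> float:
    """Estimate how many days an online upload would take.

    size_gb     - data size in decimal gigabytes
    mbps        - link speed in megabits per second
    utilization - share of the link you can actually use (assumed 80%)
    """
    bits = size_gb * 8 * 1000**3
    seconds = bits / (mbps * 1_000_000 * utilization)
    return seconds / 86_400
```

For example, 100 GB over a 10 Mbps line at 80% utilization works out to a little over a day (about 1.16 days), which you can compare with the shipping and processing time of an offline transfer.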
8) AWS Snow Family is another offline solution, composed of three different transfer services (AWS Snowcone, AWS Snowball, and AWS Snowmobile). The idea is similar to AWS Import/Export Disk, but this time you use AWS appliances to move your large-scale data.
You order the service online through the AWS Console, copy your data to the device, and return it once the upload is completed. The whole process is estimated to take about a week, shipping and data transfer included. But with this option, you can move from a few terabytes to petabytes of information.
When to use this solution?
- Again, when the waiting time for shipping and transferring data is justifiable as compared to any other upload method.
- Choose between AWS Snowcone, AWS Snowball, and AWS Snowmobile, depending on the volume of your data. AWS Snowcone is the smallest physical storage device in the AWS Snow Family. It’s light and portable, and users order Snowcones for smaller data transfers as well as for locations with connectivity issues.
- AWS Snowball is for more large-scale migration of data (from 42 TB). And finally, AWS Snowmobile is a whole shipping container. With this service, you get more secure, high-speed data transfer, GPS tracking, video surveillance, etc., etc.
Common issues and solutions
Amazon S3 attracts users with its multiple benefits. The storage is highly durable and offers virtually unlimited capacity and strong security options. Although most of the time it’s a sheer delight for you to work with this storage, disruptions still happen.
Here’s how you can solve them:
Problem 1: Your access to the storage is denied.
Solution 1: This means you’re using the wrong access key and/or secret key, or you may simply have no rights to access the storage. Check your credentials as well as the permission policy attached to your IAM user if this is applicable.
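When checking the permission policy, it helps to know what a workable one looks like. Below is a sketch of a minimal IAM policy document granting an IAM user access to a single bucket, built as a Python dictionary; the function name and the chosen set of actions are assumptions, so adjust them to what your team actually needs.

```python
import json


def bucket_access_policy(bucket: str) -> dict:
    """Build an IAM policy allowing listing, reading, writing and deleting
    objects in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {   # Object actions apply to the keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(bucket_access_policy("acme-campaign-assets"), indent=2))
```

A common cause of access errors is mixing up the two resources: bucket-level actions need the bucket ARN, while object-level actions need the ARN ending in `/*`.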
Problem 2: Your specified key doesn’t exist.
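Solution 2: You receive this message when the key you requested can’t be found in the bucket, which usually points to issues with the naming of your files and buckets. Check the names for typos (keys are case-sensitive and include the full folder prefix), and remove punctuation and special characters if present.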
Problem 3: Your signature doesn’t match.
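Solution 3: If you see this error message, first make sure your secret key is entered correctly, since the request signature is calculated from it. It’s also possible that you used capital letters and/or spaces in your bucket name. A bucket can’t be renamed, so in that case create a new one with proper naming conventions and move your files there.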
Problem 4: Your files don’t upload/download.
Solution 4: Check your internet connection and/or speed. Clear the cache. Make sure you have free space on your device for downloads, and your uploads correspond to Amazon S3 file requirements.
Then you may need to check your host settings: go to Downloads - Settings - Extensions - Amazon S3. Is your host set up correctly? The region? Review the filenames of your files (the number of characters and whether they have any special characters, for example). If you’re using a mobile device, check the size of your photo/video - it should not exceed 2 GB.
Problem 5: When your files grow in number, you’ll notice soon it becomes more difficult to manage them.
Solution 5: Digital Asset Management can enhance your Amazon experience and resolve this issue for you.
Advanced file organization with Pics.io DAM
Move your S3 storage to a whole new level by integrating it into Pics.io DAM. This is an advanced solution for organizing and distributing your files so as to maximize your team’s performance.
Pics.io DAM is a win-win strategy for you as an Amazon S3 user. Since it works on top of your storage, you won’t need to launch any additional software (and migrate your files again), no one will have access to your storage, & no charges for storage space. Still you get:
- Unique file organization. No need to search your storage for hours or cram all those prefixes. Pics.io displays your S3 storage in a more usual and user-friendly way so you can navigate it easily. Plus, it is very visual - you can actually see all the thumbnails, and it saves a lot of time in daily work.
- Access. With DAM, you access your files very easily by keywords, locations, dates, and so on. Or you may use the more advanced search: for example, you can find your files by content with AI-powered search.
- Collaboration with your team. Your colleagues can leave you messages under specific assets or mark the areas they want to discuss. Tag your teammates directly in your storage. And get updates on any changes in the directory.
- Sharing. With Pics.io DAM, you have unique shareable websites where you place your materials and then send the link to your clients/freelancers/partners. Customize these websites: for example, add your domain name or change the color palette to promote your brand.
- Security. Add one more level of security to your storage by changing rights and permissions. For example, you can specifically decide who can upload/download your assets, edit, and delete them. And many other pleasant surprises like linked assets, a file comparison tool, communication center, etc.
Last but not least, if you have tried multiple public cloud providers but still decided to settle on Amazon S3, this obviously happened for a reason. With its scalability, convenience, and security, Amazon S3 is indeed one of the leading storage services available on the market today. In turn, our Pics.io team will help you to make the most out of your storage and enhance your Amazon S3 experience.