Mountains of objects

Mixing my old job with my new one, the first thing that came to mind was object storage services: the paradigm for storing large amounts of unstructured data in a distributed and redundant fashion, with global access.

As with many technologies nowadays, you can build this type of service yourself (with this product from NetApp, any company can set up its own object storage service) or you can get it from the cloud. In my current company we have Google Cloud Storage as an object storage service in the cloud … I couldn’t resist trying it first 🙂

I’m not going to go deep into the different offerings, prices and the rest of the “ugly” details, but I have to say that it is worth doing the math with the calculator and checking that the reference price for storing certain types of data is very different from what we are used to.

Coming back to the technology, what are those objects stored in the cloud useful for?

The list is quite long: any information that can be stored in a file and that we can access via the HTTPS protocol, with its extensions to “push” and “pull” files, is a candidate.
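To make the “push” and “pull” idea a bit more concrete, here is a minimal sketch of what those HTTPS requests could look like against the Google Cloud Storage JSON API, using only the Python standard library. The bucket name, object name and token are placeholders of mine, and the functions only build the requests without sending them, so take it as an illustration rather than a finished client.

```python
from urllib.parse import quote
from urllib.request import Request

# Public endpoint of the Cloud Storage JSON API.
API_ROOT = "https://storage.googleapis.com"


def build_upload_request(bucket, object_name, data, token,
                         content_type="application/octet-stream"):
    """Build (but do not send) the HTTPS request that "pushes" an object."""
    url = (
        f"{API_ROOT}/upload/storage/v1/b/{quote(bucket, safe='')}/o"
        f"?uploadType=media&name={quote(object_name, safe='')}"
    )
    headers = {
        "Authorization": f"Bearer {token}",  # OAuth 2.0 access token (placeholder)
        "Content-Type": content_type,
    }
    return Request(url, data=data, method="POST", headers=headers)


def build_download_request(bucket, object_name, token):
    """Build (but do not send) the HTTPS request that "pulls" an object."""
    url = (
        f"{API_ROOT}/storage/v1/b/{quote(bucket, safe='')}"
        f"/o/{quote(object_name, safe='')}?alt=media"
    )
    return Request(url, method="GET",
                   headers={"Authorization": f"Bearer {token}"})
```

With a real access token, either request could be passed to `urllib.request.urlopen` to actually transfer the data. Note that the object name is URL-encoded, so a name like `photos/bike.jpg` travels as `photos%2Fbike.jpg`.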

It ranges from web page content, like this picture where I’m riding my mountain bike, or the videos from your last vacation, to, at a more enterprise level, all those scanned documents that we generate, the pictures taken by field staff with mobile apps, the backups of your servers, and a long etcetera.

I think the flexibility of the use cases is key, and what we need is to have architects and developers using this new model for their unstructured data … if we keep storing this massive amount of data in traditional file servers, it will be difficult to keep up with the pace of growth and the performance that many organizations need … and we will waste some money along the way.

There is an easy use case to start moving your data to the cloud: outsourcing your backup data. Whether you use your current backup software, like Veritas, Veeam or Commvault, which all support this capability in their latest versions, or a dedicated solution like NetApp’s AltaVault, you can make backup copies to the cloud without changing your current backup software or processes.

Then the point is to compare prices and service levels. How long would it take to retrieve the data from a tape stored 20 km away from my data center?

There are many other use cases, and the possibilities are incredible, because once the data is in the cloud there are many things that we can do with it … for example, last week one of my colleagues showed me how to collect the data from a user feedback form, translate the feedback into a common language and analyze the text to find out the satisfaction level of the customers.

For the moment I will stick to the object part … creating a container, in this case using GCP, is really simple:

You click on “create bucket”, then choose a name, the type of service and the region where the data will be stored.

As you can see, you don’t need to worry about the number of nodes, the number of objects, their size, or the replication of this data among different data centers (Multi-regional) … this is for me the key point of these managed services: you just use them, without needing to think about the details underneath.
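For those who prefer a script to a console, the same three choices (name, type of service and location) map onto a single request to the Cloud Storage JSON API. The sketch below is my own illustration: the project ID, bucket name, location and storage class are example values, the name check is a simplified version of the bucket naming rules, and the function only builds the request without sending it.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import Request

# Simplified naming check (an assumption of this sketch): 3 to 63 characters,
# lowercase letters, digits, dots, dashes and underscores, starting and
# ending with a letter or a digit. The real rules have a few more cases.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def build_create_bucket_request(project_id, name, location, storage_class, token):
    """Build (but do not send) the HTTPS request that creates a bucket."""
    if not _BUCKET_NAME.fullmatch(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    # The same choices made in the console: name, region and type of service.
    body = json.dumps({
        "name": name,
        "location": location,
        "storageClass": storage_class,
    }).encode("utf-8")
    url = "https://storage.googleapis.com/storage/v1/b?" + urlencode(
        {"project": project_id}
    )
    headers = {
        "Authorization": f"Bearer {token}",  # OAuth 2.0 access token (placeholder)
        "Content-Type": "application/json",
    }
    return Request(url, data=body, method="POST", headers=headers)
```

For example, `build_create_bucket_request("my-project", "mountains-of-objects", "EU", "STANDARD", token)` describes a bucket in the EU multi-region. Notice that nothing in the request mentions nodes, capacity or replication; that part stays with the managed service.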

In the next post we will see how to upload data to one of these repositories.

Regards,

Jamarmu

=====================================================

I work for Google Cloud, but this post reflects my personal ideas and opinions
