Snapshots vs. Replication vs. Backups

September 11, 2021

Have you been told that you don't need backups or replication because snapshots will fulfil all your data protection needs?

In this article, we will explain the differences between snapshots, replication and backups, and perhaps prompt you to reconsider your DR strategy. "Snapshots are all the backup you will ever need" is something you will usually hear only from storage vendors, and more often than not from vendors that don't offer a complete set of data protection strategies or solutions.

Don't get us wrong: we are not saying snapshots are bad. Snapshots have their place in the data protection chain, but they are not backups, nor do they replace replication. That said, your data protection needs differ from the next person's, so it is possible that snapshots are all you need; in practice, though, most enterprises use a combination of all three data protection capabilities and technologies.

So let's go through the three data protection methods, a little of how each works, and which use cases fit best. Snapshots are also known as point-in-time copies: by definition, a point-in-time copy is a view of the data exactly as it was at the moment the snapshot was triggered. Snapshots are by far the fastest and most efficient method of protecting data; on some systems they are almost instantaneous.

So let's look at how it works. You have the master copy, and as you write more and more data, initiating a snapshot essentially places a little bookmark or marker at that point. Every subsequent write is tracked in a journal of changes, and when you trigger the next snapshot, another journal begins. The longer you keep taking snapshots, the larger these journals grow, and performance suffers accordingly.
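As a rough illustration of that journaling idea, here is a minimal, hypothetical sketch, a toy model rather than any particular vendor's implementation, of a block store where taking a snapshot merely opens a new change journal:

```python
class SnapshottingStore:
    """Toy block store where each snapshot is a marker plus a change journal."""

    def __init__(self):
        self.master = {}      # block_id -> current data (the live copy)
        self.journals = []    # one journal per snapshot: block_id -> prior data

    def snapshot(self):
        # Taking a snapshot is nearly instantaneous: just open a new journal.
        self.journals.append({})
        return len(self.journals) - 1   # snapshot id

    def write(self, block_id, data):
        # Copy-on-write: preserve the old block in the newest journal before
        # overwriting it, so the most recent snapshot stays consistent.
        if self.journals and block_id not in self.journals[-1]:
            self.journals[-1][block_id] = self.master.get(block_id)
        self.master[block_id] = data

    def read_at_snapshot(self, snap_id, block_id):
        # A point-in-time read needs the master AND every journal from
        # snap_id onward; lose any one of them and this view is unrecoverable.
        for journal in self.journals[snap_id:]:
            if block_id in journal:
                return journal[block_id]
        return self.master.get(block_id)
```

Notice how `read_at_snapshot` has to walk the journals and fall back to the master copy; that dependency chain is exactly the weakness discussed next.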

So why do we not like snapshots, or rather, why are snapshots not backups? Because of the interdependency between all these snapshots and the data you want to recover. Recovering a given point in time actually means combining other snapshots with the master copy, so if any one of those components is corrupted or destroyed, you literally have nothing left to recover from. If the master copy dies or is corrupted, for example, none of the snapshots that depend on it can be restored either. On top of that, in most storage subsystems the snapshots live on the same storage as the master: the master sits underneath and all the snapshots sit on top of it.

This is not best practice in general, because a failure of the volume or the storage simply takes all your "backups" down with it. It's a bit like putting all your eggs in the same basket. Having said that, because snapshots are so fast and so lightweight, relying only on references and pointers, they are great for quick recovery. Snapshots work well if you only need to retain recovery points for a couple of days, though this depends heavily on how often you take them: the longer you keep snapshots, the more resources are consumed holding the journals and all the changed blocks.

Many vendors have unique implementations that help alleviate this issue, but that really just delays the inevitable; the limitations remain. Now let's look at replication. As the name suggests, replication simply means copying data to another storage system. That system can sit in the same data center, but it is often remote, to protect against data center failures as well. There are generally two types of replication: asynchronous and synchronous. Let's start with async. Asynchronous replication means data is replicated at a given interval, perhaps every five minutes, with the changes then shipped to the remote site. In the event of a disaster, the worst that can happen is that you lose up to five minutes of data, which is often articulated as a recovery point objective (RPO) of five minutes.
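To make the interval idea concrete, here is a minimal, hypothetical sketch of an async replication cycle; the `Volume` class and its methods are illustrative stand-ins, not a real replication API:

```python
import time

REPLICATION_INTERVAL = 300   # seconds; worst-case data loss (RPO) is one interval

class Volume:
    """Toy volume that remembers when each block was last written."""

    def __init__(self):
        self.blocks = {}     # block_id -> data
        self.mtimes = {}     # block_id -> time of last write

    def write(self, block_id, data):
        self.blocks[block_id] = data
        self.mtimes[block_id] = time.time()

    def changed_since(self, ts):
        return {b: self.blocks[b] for b, t in self.mtimes.items() if t > ts}

def async_replicate(local, remote, cycles):
    """Every interval, ship only the blocks changed since the last cycle."""
    last_sync = 0.0
    for _ in range(cycles):
        time.sleep(REPLICATION_INTERVAL)
        for block_id, data in local.changed_since(last_sync).items():
            remote.write(block_id, data)
        last_sync = time.time()
        # Writes landing after last_sync are at risk until the next cycle.
```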

Synchronous replication, on the other hand, replicates every IO as it is written to the storage system: it commits both the local and the remote write before acknowledging to the host that the write is good. In many cases, mission-critical applications that cannot tolerate any loss of data will opt for sync replication. In RPO terms, sync replication is what we call RPO = 0, which simply means no data loss.
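Reusing the toy `Volume` above, the sync write path can be sketched in a few lines; the key point is where the acknowledgement to the host happens:

```python
def sync_write(local, remote, block_id, data):
    """Acknowledge the host only after BOTH copies have committed the write."""
    local.write(block_id, data)
    remote.write(block_id, data)   # every write waits on the remote round trip
    return "ack"                   # success is reported only here, hence RPO = 0
```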

So why would anybody pick async replication then? As you can tell, sync replication's demands on bandwidth are extremely high and it is very latency-sensitive; async, by comparison, has far more generous bandwidth and latency allowances, making it significantly cheaper. The advantage of replication is its ability to recover very quickly with minimal data loss when an entire data center fails or the primary storage is completely lost: you already have a copy of the data and are ready to resume business. Having said that, it is not without caveats. Because every written data block is replicated, a corrupted block, or a whole batch of data that somebody accidentally or maliciously deleted, will be replicated too. As the saying goes: dirty block in, dirty block out! This makes replication great for business continuity and insulation against primary storage failures, but not so great if you want the ability to roll back to an earlier point in time, which brings me to my very last item.

Backups have been around pretty much since the beginning of time, and over the years they have evolved to resemble a combination of snapshots and replication. You make a full copy of the primary data every time you run a backup, which for most organizations is once a day. Assuming you run it at 8:00 p.m., you get a point-in-time replica of the data exactly as it looked at that moment, similar to a snapshot. Do this seven days a week and you now have seven independent copies of the 8:00 p.m. data for the last seven days. If the third copy is corrupted, you still have the second or fourth copy to recover from, unlike snapshots or replication. Backups are also perfect for long-term retention, because as long as there is capacity and resources, you can store them for as long as you want.
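Here is a minimal sketch of that nightly full copy; the destination path and the seven-day retention window are illustrative assumptions, not a recommendation for any particular tool:

```python
import tarfile
from datetime import date
from pathlib import Path

BACKUP_DIR = Path("/backups")    # illustrative destination
RETENTION_DAYS = 7               # keep seven independent full copies

def nightly_backup(source: str):
    """Write a self-contained full copy; each archive restores on its own."""
    archive = BACKUP_DIR / f"full-{date.today():%Y%m%d}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname="data")
    # Prune the oldest archives; deleting one never affects the others,
    # unlike a snapshot chain or a replicated volume.
    for old in sorted(BACKUP_DIR.glob("full-*.tar.gz"))[:-RETENTION_DAYS]:
        old.unlink()
```

The design point is independence: each archive contains everything needed to restore, which is precisely what the snapshot journals above lack.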

You may be thinking that the storage consumed by backups must then be massive, and surely that's an issue. Yes, of course, but there are many capabilities out there, such as deduplication and compression, that help with that problem; I won't go into depth on those today. The biggest issue with backups is generally time: they take the longest to protect and also the longest to recover. Advanced backup and recovery capabilities such as incremental-forever backups and dedupe appliances have improved recovery performance over the years, but regardless, backup is still the slowest of the three technologies we've discussed today. So depending on your needs and requirements, you may only need one of the three data protection methods, or a combination of all three.

To summarize my recommendations: as the most cost-effective and fundamental form of data protection for every enterprise, backup is a must! I cannot stress this enough: you need backups. For short-term data protection of three to five days, snapshots are the way to go, though I would still recommend backups alongside them. For fast recovery, snapshots and replication are the way to go, and mission-critical applications will definitely require replication together with backups. Hopefully this has been useful. I know it seems like a lot for people who are new to the data protection domain, and the three methods may sound similar in some sense, but there are subtle differences between them all.
