BlazingSQL, the GPU-accelerated SQL engine that is part of the RAPIDS ecosystem, is now fully open source. It is released under the Apache License 2.0, the permissive license published by the Apache Software Foundation.
The engine was previously known as BlazingDB, even though it was never primarily a database. It was renamed BlazingSQL to emphasize that it is chiefly a SQL engine for processing many kinds of data, not a system for storing it.
BlazingSQL has benefited greatly from the RAPIDS ecosystem. Thanks to that popularity, more than 100 developers contribute to BlazingSQL regularly. Many of these contributors share BlazingSQL’s mission and often work at the enterprises that use it, so they understand user needs precisely and have steadily extended the platform with features such as support for additional file formats.
Adoption of RAPIDS has grown sharply, and with it BlazingSQL has gained a large user base to learn from and iterate on. Being open source lets the BlazingSQL team shorten its development cycle significantly. NVIDIA is committed to building the next generation of data center technology, and RAPIDS is well placed to take advantage of those advances. With these plans in place, the open-source BlazingSQL is on a path to becoming the standard GPU SQL engine.
With tighter Dask integration and support for Apache Arrow on GPUs, the open-source BlazingSQL is helping bring interoperability to the world of GPU-accelerated databases. BlazingSQL has been associated with RAPIDS since before RAPIDS was widely known. Now, with NVIDIA helping RAPIDS improve the customer experience at scale, BlazingSQL is focusing on easy integrations so that customers can deploy it quickly.
What’s in it for users?
Efficient
During its time in the industry, BlazingSQL has solved some critical problems for its target audience. Previously, customers relied on clusters of thousands of servers to run calculations and processing jobs at massive scale. With BlazingSQL, a similar scale can be reached with a small fraction of that hardware.
Fast
Previously, some intensive tasks took hours, and sometimes days, to complete. Customers had to wait a long time before they could iterate on the processed results. With BlazingSQL, customers can use GPU acceleration through RAPIDS to finish the same tasks within minutes, which makes each iteration quicker.
Scalable
In most cases, workloads used to be built at a very small scale and then reworked for larger systems to handle complex tasks. The integration of BlazingSQL with RAPIDS lets the customer change the scale of distribution within minutes by writing a few lines of code. Queries run quickly, and customers can query raw data through their data stack and RAPIDS in just a few lines of code.
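As a rough illustration of changing scale in a few lines, the sketch below builds a BlazingSQL context either for a single GPU or for all local GPUs through Dask. It assumes the `blazingsql`, `dask_cuda` and `dask.distributed` packages are installed on a machine with NVIDIA GPUs. The helper name `make_context` and the `"lo"` network interface default are illustrative assumptions, not details from the article.

```python
def make_context(distributed=False, network_interface="lo"):
    """Return a BlazingContext for one GPU or for a local Dask GPU cluster.

    Hypothetical helper for illustration. Imports are deferred so this
    module can still be loaded on a machine without the GPU stack.
    """
    from blazingsql import BlazingContext

    if not distributed:
        # Single-GPU mode: queries run on the current device.
        return BlazingContext()

    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster

    # Start one Dask worker per visible GPU on this machine.
    cluster = LocalCUDACluster()
    client = Client(cluster)

    # Handing the Dask client to the context distributes the same SQL API.
    return BlazingContext(dask_client=client,
                          network_interface=network_interface)
```

Under these assumptions, switching between `make_context()` and `make_context(distributed=True)` is the only change needed; the SQL issued against the returned context stays the same.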
Where is the future headed?
RAPIDS is steadily becoming one of the leading analytics ecosystems. Combined with the widespread popularity of SQL, this has made BlazingSQL a first choice for RAPIDS users and for much of its target segment.
BlazingSQL contributes continuously to cuDF within the RAPIDS ecosystem, and it is built with cuDF and cuIO under the hood. Any change or improvement in cuDF and cuIO also affects BlazingSQL directly, because it is fully interoperable with RAPIDS and depends on GPU DataFrames (GDFs) for the most part.
From the beginning, BlazingSQL has aimed to reduce the amount of code needed for operations ranging from simple to complex. A single SQL command can be as powerful as hundreds of cuDF commands. BlazingSQL also removes the need to sync different databases again and again: as mentioned earlier, it can query raw files directly from storage. All in all, BlazingSQL has been instrumental in improving the accessibility and speed of RAPIDS by providing SQL support.
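To make the idea of querying raw files with a single SQL statement concrete, here is a minimal sketch. It assumes the `blazingsql` package and an NVIDIA GPU are available. The table name, file path and column names are hypothetical placeholders, and `query_raw_file` is an illustrative helper rather than part of BlazingSQL.

```python
# Hypothetical table and column names, for illustration only.
TABLE_NAME = "trips"

QUERY = f"""
SELECT passenger_count,
       COUNT(*) AS num_trips,
       AVG(trip_distance) AS avg_distance
FROM {TABLE_NAME}
GROUP BY passenger_count
ORDER BY passenger_count
"""


def query_raw_file(path, bc=None):
    """Register a raw file as a table and run QUERY against it.

    `bc` is expected to be a BlazingContext. If none is given, one is
    created here, which requires blazingsql and a GPU.
    """
    if bc is None:
        from blazingsql import BlazingContext
        bc = BlazingContext()

    # Point the engine at the file itself; no separate load into a database.
    bc.create_table(TABLE_NAME, path)

    # On a single GPU the result is typically a cuDF DataFrame.
    return bc.sql(QUERY)
```

A call such as `query_raw_file("data/trips.parquet")` (a placeholder path) would then return the aggregated result, ready for further work with cuDF.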
If you want to give it a try, you can start on free GPUs. BlazingSQL provides a platform called BlazingSQL Notebooks that helps customers get started on free GPUs. Alternatively, you can run BlazingSQL on your own system using the Docker Hub container. If you want to tailor the experience to your needs, you can modify and deploy the source code yourself.