Mr. Stalin Sabu Thomas is a Machine Learning AI engineer passionate about cutting-edge technology and solving real-world problems. He has extensive experience in programming, web development, and machine learning.
We had a conversation with Mr. Stalin about the various aspects of Machine Learning and Data Science to get better insight from the expert himself! Read what he has to say about the impact, scope, and challenges of Machine Learning and Data Science.
Q. What will be the impact of machine learning on society and the economy in the next decade?
Machine learning has already made significant impacts on society and the economy, and these impacts are expected to continue and accelerate in the next decade:
- Automation of low-skilled labor: ML has the potential to automate a wide range of jobs, like routine administrative tasks, customer service, etc. This could lead to significant job losses in some industries, but could also free up workers to pursue more creative and fulfilling work. It can also lead to a better quality of life, as some jobs, like sewage cleaning, factory maintenance, and construction, are dangerous or cause harm over time. Automation and robotics can take these up, freeing us to do more challenging things.
- Improved healthcare: Machine learning algorithms can analyze vast amounts of medical data to help diagnose diseases, predict outbreaks, and develop personalized treatment plans. So much work is happening in the drug discovery field to find treatment for incurable diseases. ML can boost this research and hence increase average life expectancy.
- Enhanced transportation: Machine learning can be used to optimize traffic flow, improve safety, reduce fuel consumption, reduce congestion, etc. Self-driving cars are a great example.
- Increased efficiency in business: Machine learning algorithms can help businesses optimize their operations by predicting customer behavior, identifying areas for cost savings, and streamlining workflows, thus allowing us to use our resources wisely and reduce losses.
- Improved scientific research: Machine learning can be used to analyze complex data sets across scientific fields. The black hole image captured by the Event Horizon Telescope is a great example of how human curiosity can be extended further.
The World Wide Web was a great change and it had its positive and negative effects. Same goes for ML. It will be important for policymakers, businesses, and individuals to be aware of these impacts and work to ensure that the benefits of machine learning are distributed fairly and equitably, with full transparency and peer reviews.
As for the economy, it is incredibly complex! We decide the economy: the things we buy and sell and the resulting cash flow are what create a healthy economy. It is built from so many complex factors, and ML is only a small one among them. Yes, thanks to ML, personalization and targeted ads work better, encouraging users to buy more, and business insights allow production and distribution to grow effectively. However, these are only tools, and like any tool, the outcome depends on how the user uses it. Even though ML provides insights and suggests how we could do business, industries must be ready to drop the old ways and incorporate these suggestions. So many factors affect the economy that it is difficult to predict, let alone control. I know ML can make a huge impact on the quality of life and on the economy, but sadly I am not an expert on the latter.
Q. How will advances in machine learning affect the nature of work and employment?
The very nature of ML is to automate business and provide insights. As mentioned above, ML can be used to automate routine, mundane tasks or replace low-skilled labor to increase efficiency. Tasks like data entry and assembly line work can be taken up by automation. This leads to significant job losses and a huge shift in the way we work. As much as we would like to blame ML and automation for job losses, responsibility for people’s job security must be taken by industry and the government.
Though this change feels abrupt, it is not. The technology has been developed over time, and in many sectors it is now ready to be used. Industries and management must make an effort to reskill and upskill employees when automation comes in. When we develop a new system, we don’t immediately replace what we currently have; rather, we run both in parallel for a while and transition gradually to the new phase. So, the more I hear people say that automation and AI are taking their jobs, the more I see great mismanagement on our part. It’s time we took responsibility for these issues, as it is typical human behavior to blame the tools instead. Companies need to protect their employees, as they are their most important resource, and governments should ensure that this is done.
Ensuring easy and affordable access to quality education and reducing the gap between the rich and the poor are not problems that ML can easily solve; they are our responsibility to solve, and ML can help us get there by providing insights. The rise of automation and ML allows human beings to focus on what we do best: working on higher-level tasks that require creativity, problem solving, and empathy, thus increasing our quality of life.
Q. Will machine learning lead to greater inequality or help to reduce it?
Once again, ML is only a tool at our disposal. How we use that tool is up to us! Thus the impact of ML on equality is complex and multifaceted. The bias problem in ML is always relevant. ML trains on data, and where does this data come from? Humans! And we know humans are biased. It’s something we cannot help, and this is where AI ethics comes into play and why more research is being conducted on model explainability.
Let’s say we are working on an ML model that predicts loan approvals for applicants. What are the features required? Credit score, dependents, education, employment status, income, etc. Using these features we can predict whether the applicant is capable of repaying the loan and how much risk is involved. Let’s say that we add another feature, race/caste. Immediately you will see the model conclude that the darker the color of your skin, the more likely you are to default on loans. Why? Because we have a real problem with systemic racism all over the world, and the historical data reflects it! Keep adding more features like gender, sexual identity, and religion, and you will see how ugly things get.
ML only holds up a mirror in front of us, and this can be a great lesson to us. So when we ask how ML reduces the gap, it’s difficult, as all it does is learn what we do! It is our job to ensure that these systems don’t make the same mistakes we have made, and to keep doing better! A truly unbiased system is difficult to achieve, but that is what we must strive for. Literally a parameter named ‘bias’ is added in ML! The increase in personalized services like targeted ads and recommendations could further inequality, as people who can afford healthcare and education are likely to have better outcomes.
Content recommendations will only make this worse, as we are steered toward very specific perspectives and fail to see the others, making us more biased and increasing inequality among us. Although I have made this sound incredibly pessimistic, ML has in fact given us easier and freer access to education, improved healthcare, and generally increased our quality of life. However, solving inequality cannot be left to the tool; it depends on the worker who uses it. It will be important for stakeholders, policymakers, businesses, and individuals to be aware of the impacts of ML and work to ensure that the benefits of machine learning are distributed fairly, transparently, and equitably.
Q. How can we ensure that machine learning systems are fair, transparent, and accountable?
Peer to Peer Review!
- Ensure you do proper feature selection on your data. Personal information, or any information that can lead to bias against certain populations, must be dropped.
- Consider the sources of your data and fact-check them.
- Make the ML model and algorithm transparent and well documented. ML models must also be explainable, meaning the reasoning behind a certain decision should be understandable by humans. This is one of the reasons why algorithms like decision trees are still prevalent today, and why simply throwing neurons into a neural network doesn’t necessarily solve the problem. This is particularly important for applications such as healthcare, finance, and criminal justice, where decisions can have significant consequences.
- Use techniques such as adversarial training, where the algorithm is trained on adversarial examples that are designed to expose potential biases. ML systems must be designed to detect and mitigate bias in the data as well as in themselves.
- Human oversight, peer-to-peer reviews, and mutual transparency are incredibly important, not just for ensuring ML systems are transparent and just, but for further improving them as well.
- Continuous monitoring and evaluation are required for deployed models: detecting data drift and ensuring the models aren’t having unintended consequences. This involves continuous testing, audits, and user feedback.
Organizations should develop ethical frameworks that guide the development and deployment of machine learning systems. These frameworks should take into account ethical principles such as fairness, accountability, and transparency.
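The feature-selection and bias-detection points above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the field names, the `SENSITIVE` set, the helpers `drop_sensitive`, `approval_rates`, and `parity_gap`, and the toy decisions are all made up for illustration. It drops sensitive attributes before training and then audits approval rates per group. The gap it reports is often called the demographic parity difference, which is only one of several possible fairness measures.

```python
# Hypothetical set of attributes that should never be model inputs.
SENSITIVE = {"race", "caste", "gender", "religion"}


def drop_sensitive(record, sensitive=SENSITIVE):
    """Return a copy of the record without sensitive attributes."""
    return {key: value for key, value in record.items() if key not in sensitive}


def approval_rates(groups, decisions):
    """Share of positive decisions (1 = approved) for each group label."""
    totals, approved = {}, {}
    for group, decision in zip(groups, decisions):
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (1 if decision else 0)
    return {group: approved[group] / totals[group] for group in totals}


def parity_gap(groups, decisions):
    """Largest difference in approval rate between any two groups."""
    rates = approval_rates(groups, decisions)
    return max(rates.values()) - min(rates.values())


# Toy applicant (made-up values): the model only ever sees the filtered record.
applicant = {"credit_score": 710, "income": 52000, "race": "X", "gender": "Y"}
model_input = drop_sensitive(applicant)
print(model_input)

# Toy audit (made-up values): group labels are kept aside purely for checking outcomes.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
print(approval_rates(groups, decisions))
print(parity_gap(groups, decisions))
```

Dropping a column does not remove proxies for it (a postal code, for instance, can stand in for a group), which is why the audit step checks outcomes per group rather than trusting the inputs alone.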
Q. How can machine learning be used to address global challenges such as climate change, poverty, and disease?
I am going to let ChatGPT take this...
Machine learning has the potential to be a powerful tool for addressing global challenges such as climate change, poverty, and disease. Here are some ways in which machine learning can be used:
Climate change: Machine learning can be used to analyze and predict weather patterns, monitor the impact of climate change on ecosystems, and optimize energy consumption. For example, machine learning algorithms can be used to optimize the placement of wind turbines or solar panels in order to maximize energy output.
Poverty: Machine learning can be used to help identify and address poverty. For example, machine learning algorithms can be used to analyze satellite imagery to estimate poverty levels in different regions, which can inform development policies and programs.
Disease: Machine learning can be used to improve disease diagnosis and treatment. For example, machine learning algorithms can be used to analyze medical images or genetic data to identify patterns or biomarkers that can inform treatment decisions.
Disaster response: Machine learning can be used to help predict and respond to natural disasters. For example, machine learning algorithms can be used to analyze satellite imagery to identify areas at risk of flooding or to predict the path of a hurricane.
Education: Machine learning can be used to improve education outcomes. For example, machine learning algorithms can be used to personalize learning experiences based on individual student needs and preferences.
Energy: Machine learning can be used to optimize energy usage, identify energy waste, and develop more efficient energy systems. For example, machine learning algorithms can be used to predict energy demand and optimize energy supply, identify the most effective energy conservation strategies, and analyze energy usage patterns to develop more efficient buildings.
Agriculture: Machine learning can be used to improve crop yields, optimize irrigation systems, and predict weather patterns. For example, machine learning algorithms can be used to identify the most effective crop management strategies, predict which crops are most likely to succeed in a particular location, and optimize the use of fertilizers and other inputs.
Transportation: Machine learning can be used to optimize transportation systems, reduce congestion, and improve safety. For example, machine learning algorithms can be used to predict traffic patterns, optimize routes for public transportation, and identify the most effective strategies for reducing accidents.
Natural resource management: Machine learning can be used to analyze natural resource data and develop more effective strategies for managing resources such as water, forests, and fisheries. For example, machine learning algorithms can be used to predict water availability, optimize forest management practices, and identify the most effective strategies for managing fisheries.
Public health: Machine learning can be used to track the spread of diseases, identify outbreaks, and develop effective prevention and treatment strategies. For example, machine learning algorithms can be used to analyze data from social media and other sources to track the spread of diseases, predict which populations are most at risk, and identify the most effective prevention and treatment strategies.
ChatGPT can go on and on. However, what we must always remember is that ML is only a tool, one among many at our disposal that help solve these problems. Focusing only on ML without giving adequate importance to the rest achieves nothing. ML is not a silver bullet!
Q. What ethical considerations need to be taken into account in the development and deployment of machine learning systems?
Ensuring fairness; eliminating bias from data; ensuring the privacy of users; transparency among peers, stakeholders, policymakers, and the general population; human oversight of model decisions (especially if the decisions carry a lot of weight); looking into the social and environmental impacts of ML (it is a power-hungry process); and having organizations that uphold AI ethics and ensure that the industry upholds them too. Pretty much all of this can be summed up as “doing the right thing!”
Q. What new applications of machine learning are likely to emerge in the coming years and how will they change our lives?
Breakthrough technologies like the web, and now ML, usually follow a sigmoid curve. You have a slow takeoff; then an explosion of growth and adoption; then competition and improvements in support; and then the dust settles. On the sigmoid curve, we are currently in the explosive growth and high adoption phase.
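The shape of that curve can be sketched in a few lines of Python using the standard logistic function. This is only an illustration: the `ceiling`, `rate`, and `midpoint` parameters and the sample time points are assumptions chosen to show the shape, not measurements of real ML adoption.

```python
import math


def logistic(t, ceiling=1.0, rate=1.0, midpoint=0.0):
    """Logistic (sigmoid) curve: slow start, rapid growth, then saturation."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


# Sample the curve before, around, and after the midpoint.
for t in (-6, -3, 0, 3, 6):
    print(f"t={t:+d}  adoption share={logistic(t):.3f}")
```

Growth is fastest at the midpoint, where the curve passes through half of its ceiling, and it flattens out on both sides.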
We will keep improving the technology and solving existing problems better and better: personalized healthcare treatments; autonomous transportation; smart homes and cities; manufacturing maintenance and quality assurance; natural language processing (chatbots, virtual assistants, etc.); and much more.
We can also expect more breakthroughs in areas like drug discovery, AI-optimized manufacturing that allows precision at the atomic scale, and reducing our carbon footprint. There will be much more focus on reinforcement learning algorithms, giving rise to new, non-human pattern recognition. Cybersecurity systems, vulnerability detection, and defense will be accelerated. Simulations of photorealistic environments, fluid mechanics, physics, etc. will become faster. The scope is endless.
Q. How can machine learning be made more accessible and inclusive to people from diverse backgrounds and industries?
Stop focusing only on ML! Every industry is important, and for a country to sustain itself, there must be a balance, with equal importance given to every sector! Focusing only on IT and ML doesn’t do any good, as these are tools meant to be used everywhere. ML engineers and data scientists cannot know everything. Programmers are good at one thing, and that is converting logic to code! We try to give solutions to problems, but to give a solution one must understand the problem. Who better to ask than the SMEs of each industry themselves?
I keep seeing this trend where governments, schools, and parents give way too much importance to IT and ML because of job security. Job security is one factor, but not the primary one! The primary factor must always be passion. Besides, isn’t job security the responsibility of government and industry? I have friends who study architecture and have incredible passion for it, but the job market is terrible because not enough importance is given to the field. So many fields get overshadowed by the next big thing. What we fail to understand is that for an economy to sustain itself and for our quality of living to improve, we need to give importance to all fields and accept the impact each one creates.
Putting all your eggs in one basket is a terrible idea. Creative industries, agriculture, construction, finance, electronics, teaching, healthcare, defense, etc. are all important! No job is beneath any other, and every industry has its own challenges. Focusing only on IT and ML will only take you so far. Improve other industries, and since IT and ML form a multi-domain field, it is very easy to get into this space from them as well! The more money you put into the IT industry while letting other industries starve, the greater the stigma around adopting automation and ML in those industries becomes!
The attitude that equates success with cracking entrance exams and getting an IT job is incredibly flawed and extremely inefficient. The hype around ML is a slow poison, and it is high time we realized that every industry has to be treated with the utmost care and respect! Adopting IT and ML in different industries is not that difficult as long as the industries exist and people work in them. Once again, ML is a tool; you need people to use this tool well!
Furthermore, to make these tools more accessible, we need to make education and training more accessible and adopt open source tools more widely so that they become standardized and widely peer reviewed. Most important of all is community building, as this is the core of any technology adoption.
Q. What role will human creativity and intuition play in the future of machine learning?
What do you mean, future? ML exists thanks to human creativity and intuition! No machine can ever replace that. You are probably asking because of the rise of generative models like ChatGPT, Stable Diffusion, etc. None of these tools is a replacement for human creativity and intuition; rather, they are tools that allow us to iterate quickly. For example, in the indie gaming industry, developers might not have the resources to hire artists to build assets, environments, or sounds for their games, which are some of the most critical aspects of a game. Using these generative tools, indie developers might be able to build something at very low cost. But are they replacing the actual artists? Of course not, as the end product is never quite unique and can never be exactly what we expect it to be!
While ML algorithms are capable of analyzing vast amounts of data and detecting patterns that might be impossible for humans to discern, they still rely on humans to guide and interpret their output, select the right features, source data, create and validate hypotheses on the data, understand the problem statement, and explore new ML algorithms. At the end of the day, human beings are needed to make effective use of these tools and take the right action based on the ML algorithms’ recommendations.
Don’t believe me? This is what ChatGPT says:
Overall, while machine learning is likely to play an increasingly important role in many areas of our lives, human creativity and intuition will continue to play a critical role in the development and deployment of this technology. By combining the strengths of both humans and machines, we can develop more innovative and impactful solutions that address complex problems and benefit society as a whole.
Q. How do you ensure the ethical use of data in your organization?
I think we have already covered this. We have data governance in place; the data we work on is company owned, not personal; we maintain transparency in our models and data; and we have regular monitoring and collaboration with stakeholders and users.
Q. How do you prioritize and allocate resources for data science projects within the company?
I am not in charge of that. Typically, you need to set a business objective and plan around it. What is the problem we are hoping to solve, and what impact are we expecting? Assess the feasibility of the project and the availability of resources. Continuously monitor and communicate progress; we follow the agile methodology for this.
Q. How do you keep up with the latest developments and advancements in the field of data science?
- Attend online conferences and events
- I personally follow people/channels like 3Blue1Brown, Computerphile, Sebastian Schuchmann, Two Minute Papers, etc.
- Online courses to upskill myself
- Reddit news and Discord channels
- Follow industry blogs and publications like the summer, Paperspace, Distill
Q. How do you work with other departments, such as marketing and product development, to drive data-informed decisions?
They tend to be stakeholders of the project: they validate the results of the model and give direction and guidance.
Q. Can you talk about any challenges your team has faced and how you overcame them?
The post-COVID effect of working remotely was a challenge, not because everyone was working from home, but because everyone was in very different parts of the world, so the time difference was definitely a problem. We had very little overlap with the different teams, and we had to ensure we spent that time wisely! Our project lead was from the US, so getting guidance from him was particularly challenging, as we had a very small window of communication in which he had to cover everyone’s status and clear everyone’s backlogs or doubts. It was definitely difficult for him, but we persevered!
On the technology side of things, when was there not an issue? There were definitely a lot of challenges that we faced: ideas that didn’t quite scale or work out as we thought they would, constant refactoring as our requirements kept changing, network issues, scaling issues, administrative issues, billing issues, you name it! It was an amazing experience going through all these different hoops and getting on calls with different teams; we even deployed things and fixed bugs midway through our demos to stakeholders and prayed that nothing went wrong in between. We were fearless that way, and thanks to this, even as an external consultant I always felt at home, which made things better.
Q. How do you measure the success of data science initiatives within the company?
Measuring the success of data science initiatives within a company can be challenging. We like to focus on metrics that can be used to evaluate them, but it’s not that easy.
Deciding on business metrics beforehand can make things easier. Metrics like revenue/profit, customer satisfaction, cost savings, and return on investment are a few, but they can be tricky, as we need to ensure that the ML recommendations are actually executed and recorded as well.
We have an abundance of model metrics, like R² score, precision, and F1, that describe how good a model is. The business doesn’t get much intuition from them, but they are used to build confidence.
User feedback is incredibly important, and there must be a proper feedback loop in place.
Time to production: how long does the solution take, from gathering and ingesting data to being production ready for users?
User usage analytics is a great way to see how much improvement has been made.
ML solves very different problems, so you cannot have anything general in nature. Typically, the stakeholders involved create the acceptance criteria that we need to work toward.
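For readers unfamiliar with the model metrics named above, here is a minimal, dependency-free Python sketch of how precision, recall, F1, and R² are computed. The helper names and the toy labels are made up for illustration; in practice a library such as scikit-learn provides equivalent functions.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


def r2_score(y_true, y_pred):
    """Coefficient of determination; assumes y_true is not constant."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy classification labels (made up): 3 true positives, 1 false positive, 1 false negative.
labels = [1, 0, 1, 1, 0, 1]
predictions = [1, 0, 0, 1, 1, 1]
print(precision_recall_f1(labels, predictions))  # (0.75, 0.75, 0.75)

# Toy regression targets (made up).
print(r2_score([3.0, 5.0, 7.0], [2.5, 5.0, 7.5]))  # 0.9375
```

These numbers describe how well a model fits held-out data. They say nothing on their own about revenue or cost savings, which is why they are paired with business metrics.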
Q. Can you discuss any initiatives your company has taken to increase diversity and inclusion in the data science field?
I am not in a position to make those decisions, nor have I looked into them. But I believe the approach is the same one that must be taken when hiring any candidate. Discriminate on skill only!
We are deeply grateful to Mr. Stalin for taking the time to answer our questions. We hope these have helped you get more clarity on the topic.