Distributed Systems (Part 2)

  • In the previous article, we had a high-level overview of distributed systems.
  • We saw why data is so important.
  • We learned that big data is a large amount of diverse information that arrives in ever-increasing volumes and at ever-increasing speeds.
  • Big data can be structured (typically numerical, readily formatted, and stored) or unstructured (often non-numerical, more free-form, less quantifiable, and difficult to format and store).
  • Big data analysis may benefit nearly every function in a company, but dealing with the clutter and noise can be difficult.
  • Big data can be gathered willingly through personal devices and apps, through questionnaires, product purchases, and electronic check-ins, as well as publicly published remarks on social networks and websites.
  • Big data is frequently kept in computer databases and examined with software intended to deal with huge, complicated data sets.
  • Lastly, we learned about the challenges of big data.
  • In this article, we will learn how big data works, the uses of big data, and its advantages and disadvantages.

What Makes Big Data Work?

  • Big data comes in two types: structured and unstructured.
  • Structured data is information that the company already stores in databases and spreadsheets, and it is typically numeric.
  • Unstructured data is unorganized data that does not fit into a predetermined model or format.
  • It includes information gleaned from social media, which helps organizations learn about customer needs.
  • Big data can also be collected from publicly published remarks on social networks and websites.
  • The inclusion of sensors and other inputs in smart devices enables data to be collected in a wide range of settings and conditions.
  • Big data is frequently kept in computer databases and examined with software intended to deal with huge, complicated data sets.
  • Many software-as-a-service (SaaS) firms specialize in handling this kind of complicated data.

Big Data Applications

  • Data analysts examine the relationship between different types of data, such as demographic data and purchasing history, to determine whether a correlation exists.
  • Such evaluations can be done in-house or by a third party that specializes in turning large data sets into understandable formats.
  • Businesses frequently turn to such specialists to analyze big data and turn it into usable information.
  • Data analysis findings can be used by nearly every department in a company, from human resources and technology to marketing and sales.
  • The purpose of big data is to shorten the time it takes for products to reach the market, reduce the time and resources needed to gain market adoption, reach target audiences, and keep customers satisfied.
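
To make the correlation step above concrete, here is a minimal sketch in Python. It computes the Pearson correlation coefficient by hand for two small, made-up columns. The customer ages, the spending figures, and the `pearson` helper are all assumptions for illustration and are not taken from any real data set.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalised; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)

# Hypothetical customers: age (demographic data) and yearly spend
# (purchasing history). The numbers are invented for this example.
ages = [22, 31, 38, 45, 53, 60]
yearly_spend = [310, 420, 515, 640, 700, 820]

r = pearson(ages, yearly_spend)
print(f"correlation between age and yearly spend: {r:.2f}")
```

A value close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests none. Keep in mind that a correlation alone does not show that one variable causes the other.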

The Benefits and Drawbacks of Big Data

  • The growing amount of data available creates both benefits and challenges.
  • In principle, having more data about customers (and potential customers) should allow businesses to better tailor products and marketing efforts to ensure customer satisfaction and repeat business.
  • Companies that collect large amounts of data have the opportunity to conduct deeper and richer analyses for the benefit of all stakeholders.
  • While improved analysis is a good thing, big data can also create overload and noise, which reduces its usefulness.
  • Companies must handle larger volumes of data and determine which data represents signal rather than noise.
  • A critical issue is determining what makes the data relevant.
  • Furthermore, the data’s type and format may require special handling before it can be used.
  • Structured data, which is made up of numeric values, is simple to store and sort.
  • Unstructured data, such as emails, videos, and text documents, may require more sophisticated techniques before it becomes useful.
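
The difference in handling can be sketched in a few lines of Python. Structured records can be sorted directly, while free text has to be parsed first. The order records, the email text, and the "order #<digits>" pattern below are hypothetical and only serve as an illustration. Real unstructured data usually needs far more than one regular expression.

```python
import re

# Structured data: every record has the same named fields,
# so sorting and filtering take a single line.
orders = [
    {"order_id": 1003, "amount": 59.90},
    {"order_id": 1001, "amount": 120.00},
    {"order_id": 1002, "amount": 15.50},
]
by_amount = sorted(orders, key=lambda order: order["amount"], reverse=True)

# Unstructured data: free text must be parsed before it can be used.
# The "order #<digits>" convention is an assumption made for this example.
email = "Hi, my order #1002 arrived damaged. Order #1003 was fine, thanks!"
ORDER_REF = re.compile(r"order\s*#(\d+)", re.IGNORECASE)

def extract_order_ids(text):
    """Return the order numbers mentioned in a piece of free text."""
    return [int(number) for number in ORDER_REF.findall(text)]

mentioned = extract_order_ids(email)

print("orders sorted by amount:", [order["order_id"] for order in by_amount])
print("orders mentioned in the email:", mentioned)
```

Once the order numbers are extracted, the email can be joined with the structured order records like any other table.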

Possible Solutions: Scaling Up vs. Scaling Out

  • Modern applications are always changing, evolving to meet new objectives, and they operate in an environment with shifting resource demands.
  • If you don’t know how to scale effectively, you’re not just doing your application a disservice; you’re also putting unnecessary strain on your operations team.
  • Deciding by hand when to scale up or down is very difficult.
  • If you provision enough infrastructure to handle peak traffic, you may end up overspending when demand is low.
  • If you provision for your average load, traffic spikes will degrade your application’s performance, and some of those resources will still sit unused when traffic drops below average.
  • As your cloud workload changes, you may need to add infrastructure to handle increasing load, or it may make sense to reduce infrastructure when demand is low.
  • The “up or out” portion is perhaps a little less obvious.
  • Scaling out means adding more functionally equivalent components in parallel to spread out a load.
  • For example, this would entail increasing the number of load-balanced web server instances from two to three.
  • Scaling up, by contrast, means making a single component bigger or faster so that it can handle more load.
  • For example, this would include switching from a virtual machine (VM) with two CPUs to one with three.
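
The sizing trade-offs above can be illustrated with a rough back-of-the-envelope sketch. It compares fixed provisioning for peak load, fixed provisioning for average load, and adding or removing instances hour by hour (scaling out), and then shows the same arithmetic for adding CPUs to a single server (scaling up). The traffic numbers, the capacity of 200 requests per second per instance or per CPU, and the `units_needed` helper are all assumptions. Real capacity rarely grows perfectly linearly with instances or CPUs.

```python
from math import ceil

def units_needed(load_rps, capacity_per_unit_rps, minimum=1):
    """Smallest number of units (instances or CPUs) that covers the load."""
    if capacity_per_unit_rps <= 0:
        raise ValueError("capacity per unit must be positive")
    return max(minimum, ceil(load_rps / capacity_per_unit_rps))

# Hypothetical traffic over six hours, in requests per second.
hourly_load = [120, 90, 150, 480, 510, 200]
PER_INSTANCE_RPS = 200  # assumed capacity of one web server instance

# Fixed sizing for the peak: never overloaded, but mostly idle.
fixed_for_peak = units_needed(max(hourly_load), PER_INSTANCE_RPS)
instance_hours_fixed = fixed_for_peak * len(hourly_load)

# Fixed sizing for the average: cheaper, but overloaded during spikes.
average_load = sum(hourly_load) / len(hourly_load)
fixed_for_average = units_needed(average_load, PER_INSTANCE_RPS)
overloaded_hours = sum(
    1 for load in hourly_load if load > fixed_for_average * PER_INSTANCE_RPS
)

# Scaling out hour by hour: add or remove instances as the load changes.
scaled_out = [units_needed(load, PER_INSTANCE_RPS) for load in hourly_load]
instance_hours_scaled = sum(scaled_out)

# Scaling up instead: one server, sized in CPUs (assumed 200 rps per CPU).
PER_CPU_RPS = 200
cpus_before = units_needed(400, PER_CPU_RPS)
cpus_after = units_needed(600, PER_CPU_RPS)

print("instance-hours when sized for peak:", instance_hours_fixed)
print("hours overloaded when sized for average:", overloaded_hours)
print("instances per hour when scaling out:", scaled_out)
print("instance-hours when scaling out:", instance_hours_scaled)
print("CPUs when scaling up from 400 to 600 rps:", cpus_before, "->", cpus_after)
```

In this toy scenario, following the load uses fewer instance-hours than sizing for the peak and avoids the overloaded hours that come with sizing for the average. That is the motivation for automating scaling decisions rather than making them by hand.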

Conclusion:

  • So far in this article, we covered how big data works, the uses of big data, its advantages and disadvantages, and the difference between scaling up and scaling out.
  • In the next article, we will take an in-depth look at solutions to the data explosion using Hadoop.

One of India’s leading institutions providing world-class Data Science & AI programs for working professionals with a mission to groom Data leaders of tomorrow!

INSAID
