Posts

Showing posts from May, 2025

Applying Big Data to the Folding Protein Problem

One unique problem that Big Data techniques have helped with is uncovering the complex folds and structures in hundreds of protein families. Big Data techniques were used on a massive database of 2 billion genome sequences to decode the structures of over 600 protein molecules. This immense dataset provides a crucial foundation for understanding the intricate ways proteins assume their complex three-dimensional shapes. The data was processed using Rosetta@home, a distributed computing platform. This led to a significant reduction in the cost and time required to decode the protein structures. This innovation has rapidly accelerated scientists’ understanding of proteins, which helps with research in a range of scientific fields. In 2018 an AI model was used to develop this approach further, increasing the decoding of proteins sixfold. AlphaFold2 uses deep-learning technology to increase the discovery of protein structures to incredible levels. Over 60 year...

Big Data Visualisation

With all this data and all these fancy data mining and data analysis techniques, how can the human at the end of the pipeline make any sense of it? Computers may be able to understand 1s and 0s, but humans require a bit more imagination. This is where data visualisation comes in. Data visualisation is the process of taking raw information from the Big Data processes and creating graphical representations using things like charts, graphs, and maps. These techniques are essential for making sense of the overwhelming amount of data and information, and for taking action on the insights and predictions from the Big Data process. There are many techniques used in data visualisation, such as: Maps – These come in many forms, such as geospatial maps for integrating geographical data with analytical data, heat maps for showing a simpler view of large and complex amounts of data using density and magnitude, and tree maps for showing the part-to-whole relationships within a dataset. (1) Graphs – Gra...

Data Mining Methods

As I have discussed throughout this blog, the amount of data generated today and the rate at which it is generated are mind-boggling. Data mining is the process through which value is extracted from this endless sea of data. It involves sifting through massive datasets to uncover anomalies, patterns, and correlations that can help solve problems through data analysis. This process relies heavily on the effective implementation of data collection, warehousing, and processing, and it is only becoming more valuable with the growth of Big Data and data warehousing. (1) There are several powerful techniques that make up the process of effective data mining: Association Rules – This technique involves searching for relationships between variables. These relationships create additional value within the data. Classification – This technique involves assigning objects to predefined classes. This groups objects by characteristic or some other common factor. This helps organise and su...
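To make the Association Rules idea a little more concrete, here is a minimal sketch of the two measures commonly used to score a rule such as "bread implies butter": support and confidence. The shopping baskets and the function names are hypothetical illustration data, not taken from any real dataset or library.

```python
# Toy shopping baskets; the items are hypothetical illustration data.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the share that also contain the consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


# How often bread and butter appear together, and how often butter
# appears given that bread is already in the basket.
rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print(rule_support, rule_confidence)
```

A rule with high support and high confidence points to a relationship that shows up often and holds reliably, which is the kind of pattern association rule mining looks for.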

What is Big Data suited to?

Though Big Data is having a huge impact on many areas of society, it is, like all technologies, a tool. Big Data is not beneficial in all areas and all instances and must be harnessed properly to take advantage of its abilities. So, what kind of problems is Big Data suited to? Well, this can be answered by looking at an example of how Big Data can help businesses. Big Data can help businesses understand customers better. It helps companies understand what customers like and what they want. This leads to better products and smarter, more targeted advertising, which in turn helps the business with customer retention. Big Data can also help businesses by finding hidden patterns in the huge amounts of data that are collected. This helps businesses understand what’s popular, predict what might happen in the future, and even spot potential problems before they become big issues. Finally, Big Data can help businesses work more efficiently, avoid delays, and waste less. By analysing a b...

Strategies For Limiting the Negative Effects of Big Data

Big Data offers incredible potential to impact many areas of society and, as discussed in my previous posts, has big impacts on individuals. But how do we ensure Big Data brings solutions and innovations while balancing its impacts on privacy and democracy? One way that society can place crucial safeguards on Big Data is through legislation. The UK’s Data Protection Act and the EU’s General Data Protection Regulation (GDPR) are two examples of this kind of regulation. These regulations empower individuals by giving them more rights and control over their personal data. They enforce principles such as:
· Ensuring data is collected only for specific reasons.
· Minimising what data can be collected, ensuring only necessary data is collected.
· Ensuring data is correct and up to date.
· Requiring ...

Implication of Big Data for Individuals

In my last few posts, I have discussed all the ways in which Big Data is changing the modern world in the fields of business, science, health care, etc. But another question that must be asked is: how does this affect people individually? As much of this data is generated by individuals and then collected and analysed en masse by companies and institutions, how can this data be used to affect individuals? One way to explore these questions is to look at the Investigatory Powers Act of 2016. This controversial piece of legislation grants security services considerable powers in relation to data. The Act provides security services with the powers to legally: Bug Devices – Upon receiving a warrant, security services can legally monitor private devices. Companies are also legally obligated to assist security services in doing so and bypass encryption if possible. Acquire Bulk Communications – Security services can legally acquire access to large datasets if a serious crime has ...

Limitations of Predictive Analysis

In my blog post on traditional statistics, I discussed the idea of inferential statistics and the limitations of using a sample to make estimations about the population the sample is drawn from. In the world of Big Data this translates to the concept of predictive analytics. This is when Big Data leverages huge data sets, statistical algorithms, and machine learning to make predictions of future outcomes. Predictive analytic models come in two types: classification and regression models. Classification models put data objects into categories and make predictions based on this, whereas regression models predict continuous data. A classification model may sort customers into categories and make predictions about how receptive they would be to marketing, whereas a regression model would make predictions about how much money a customer will generate during their relationship with the company. (1) The idea of predictive analytics is becoming an increasing part of Big Data,...
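The difference between the two model types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer records are made up, the classifier is a simple nearest-centroid rule, and the regression is an ordinary least-squares line. Real predictive analytics systems would use far larger datasets and more capable models.

```python
from statistics import mean

# Hypothetical customers: (monthly_visits, responded_to_campaign, lifetime_spend)
customers = [
    (1, False, 40.0),
    (2, False, 80.0),
    (3, False, 120.0),
    (6, True, 240.0),
    (7, True, 280.0),
    (8, True, 320.0),
]


def fit_classifier(rows):
    """Nearest-centroid classifier: remember the mean visits of each class."""
    yes = mean(v for v, responded, _ in rows if responded)
    no = mean(v for v, responded, _ in rows if not responded)
    return yes, no


def predict_class(model, visits):
    """Classification: the output is a category (receptive or not)."""
    yes, no = model
    return abs(visits - yes) < abs(visits - no)


def fit_regression(rows):
    """Least-squares fit of spend = slope * visits + intercept."""
    xs = [v for v, _, _ in rows]
    ys = [s for _, _, s in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar


def predict_value(model, visits):
    """Regression: the output is a continuous number (expected spend)."""
    slope, intercept = model
    return slope * visits + intercept


classifier = fit_classifier(customers)
regressor = fit_regression(customers)
print(predict_class(classifier, 5), predict_value(regressor, 5))
```

The classifier answers a yes/no question about a new customer, while the regressor returns a number on a continuous scale, which is exactly the distinction drawn above.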

Technological Requirements of Big Data

For Big Data technologies to handle the huge quantities of endless data in today’s world, there are three main technological requirements: storage, processing, and data integration. Storage – The volume of data being produced is already at a staggering level and is growing at an exponential rate. For Big Data to capture and analyse this data, the first challenge is storing it. Storage solutions have evolved alongside Big Data to meet the ever-increasing demand for storage, with the rise of the cloud being the most important factor in increasing the capacity of businesses and organisations to store their data and keep costs down. Data Lakes provide scalable platforms for storing data in all its different varieties. (1) Processing – With all that data being stored, the next technological demand of Big Data technologies is processing power. As processing power has increased year on year, the ability to process huge amounts of data has increased in ...

Contemporary Applications of Big Data in Society

In my last post I discussed the way in which Big Data has had an impact on business and science in today’s world. Because of the variety, volume, and velocity of data these days, Big Data is having an impact on almost all aspects of today’s society. In business, science, healthcare, government, social media, entertainment, retail and more, data seems to be having a revolutionary impact everywhere. To give an example of this, I will discuss Smart Cities in this post. In 2025 it is estimated that 60% of the global population lives in cities. Cities come with a lot of challenges, and the concept of smart cities has been developed to use modern technology to address some of these challenges. A smart city is a place where traditional networks and services are made more efficient with the use of digital solutions for the benefit of its inhabitants and business. (1) This involves integrating various digital solutions and Big Data technologies into city infrastructure to improve servic...

Contemporary Applications of Big Data in Science

Another way in which Big Data is having an impact on the world is through science. Big Data is ideal in the field of scientific research, changing how researchers gather, analyse, and interpret information. This allows better predictions and insights into research data and speeds up the process as well, leading to improvements in many different areas of research such as: Healthcare – Big Data analysis is leading to breakthroughs in personalised medicine and treatments for patients, making them more effective. It helps healthcare professionals with tracking, predicting, and reacting to disease outbreaks with greater efficiency and accuracy, leading to a reduced impact. Big Data is also speeding up the process of drug discovery compared with traditional methods, shortening the time between discovery and bringing the drug to the market. Environmental Science – Researchers are using Big Data to improve climate models, track biodiversity, and better understand and protect ecosystems. This is helpi...

Contemporary Applications of Big Data in Business

Because of the proliferation of data and Big Data technologies in today’s world, Big Data is having a big impact on many areas of today’s society. Modern businesses are using Big Data in many ways: Enhancing Customer Experience – Businesses can take advantage of Big Data to provide customers with a better experience through personalized marketing and customer analytics. Personalized marketing allows businesses to understand individual customers’ preferences and behaviour better, so they can tailor their marketing to provide customers with what they want. Customer analytics allow businesses to understand the customers in their market better and tailor services and products to improve their businesses. Improving Operations – Big Data can be used by businesses to make their supply chains more efficient by identifying areas that can be improved and reducing waste throughout the chain. Big Data can be incorporated into decision making processes to help businesses reduce risk while still...

Big Data Analytics

In my last post I discussed traditional statistics and their use in deriving value from data. In the examples I discussed in that previous post, the data sets were small and structured. This is what separates traditional data from Big Data. Because of the volume, variety, and velocity of data in the modern world, traditional data analysis techniques don’t work, and the field of Big Data analytics has formed around the analysis of large data sets. Big Data analytics can be broken down into two categories: Decision-orientated and Action-orientated. Decision-orientated – Focuses on analysing data to provide insights to support overall strategy and decision-making, looking at trends and patterns emerging from data to inform current strategies and predict possible future outcomes. Action-orientated – Focuses on analysing data in real time to make immediate reactions with a focus on speed, reactivity and automation. Big Data technologies allow businesses to stay up-to-date and react ...

Traditional Statistics

In my last post I discussed the value of data. In that post I discussed how value is extracted from data by processing raw data and packaging it, but what does this mean? Data in its purest form is just points of information. Without context or analysis applied, it doesn’t necessarily serve much use. One way in which human beings derive value from data is through statistics. Traditional statistics is a tool for understanding and interpreting data. There are many ways in which this tool can be applied to data, with two main branches: descriptive and inferential. Both are ways to extract meaning from data but are used in different ways for different purposes. Descriptive Statistics – Descriptive statistics are used to describe or summarise data in a meaningful way. They help us visualise what the data is showing and make it easier to recognise patterns that may emerge from the data. Descriptive statistics are applied to a sample and measure chosen properties of that sample t...
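As a small illustration of descriptive statistics, the sketch below summarises a sample using Python's built-in `statistics` module. The sample values are made up for the example, and the choice of measures (mean, median, mode, and sample standard deviation) is an assumption about which properties might be of interest.

```python
from statistics import mean, median, mode, stdev

# Hypothetical sample, e.g. the number of purchases made by eight customers.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "mean": mean(sample),      # the average value
    "median": median(sample),  # the middle value when sorted
    "mode": mode(sample),      # the most common value
    "stdev": stdev(sample),    # sample standard deviation (spread)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Each figure describes only the sample itself; making claims about the wider population it was drawn from is the job of inferential statistics.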

Value of Data

In the modern world, data is a valuable resource. The value of data, though, can be harder to quantify than that of more tangible resources such as equipment, real estate, or employees. Though these assets can be more easily digested in terms of cost vs. value, the impact data has had on the modern world is undeniable through cost reduction, revenue increase, and income generation. With data being produced endlessly at an ever-faster pace, the value of data is only increasing. Though data is constantly being produced in the modern world, the type and quality of this data vary massively. Data alone doesn’t necessarily have the value that businesses and organisations are looking for, and that is why Big Data technologies must be applied to extract value from the endless sea of data and get that data to the right people at the right time. Using Big Data to extract value from data comes at a cost and how the end results balance out this cost can be understood through the 1-10-100...