

  Hologres vs AWS Redshift

Hologres and Redshift are both data warehousing solutions, but they differ in features, architecture, and target use cases. Underlying Infrastructure. Hologres: Built on Alibaba Cloud's Apsara distributed computing platform, Hologres relies on that infrastructure for storage, computation, and management. It benefits from Alibaba's expertise in cloud-native architecture and real-time data processing. Redshift: Amazon Redshift is based on a Massively Parallel Processing (MPP) architecture, designed to distribute and parallelize queries across multiple nodes for fa...

2,510 0       ALIBABA BIG DATA AWS HOLOGRES REDSHIFT REAL-TIME


  Cracking the Data Lineage Code

What is Data Lineage? Data lineage describes the life cycle of data, from its origins to how it is manipulated over time until it reaches its present form. The lineage explains the various processes involved in the data flow of an organization and the factors that influence each process. In other words, data lineage provides data about your data. Data lineage helps organizations of all sizes handle Big Data, as finding the creation point of the data and its evolution provides valuable insights. Almost every decision can be helped by data lineage, from a software engineer choosing what...

1,363 0       BUSINESS BIG DATA DATA LINEAGE


  How Kafka achieves high throughput low latency

Kafka is a message streaming system with high throughput and low latency. It is widely adopted by many large companies. A well-configured Kafka cluster can achieve very high throughput with millions of concurrent writes. How does Kafka achieve this? This post tries to explain some of the techniques Kafka uses. Page cache + disk sequential write: Every record Kafka receives is eventually written to a file on disk. But if Kafka wrote to disk each time it received a record, performance would suffer. In fact, Kafka has a clever design here, which is that it utilizes the pag...

8,972 0       BIG DATA KAFKA


  Data Scientists and Their Harder Skills than Big Data

The field of data science is often confused with that of big data. Data science gives a company's decision makers a logical approach to their decisions. Who is a Data Scientist? A Data Scientist reviews a huge collection of data (that may extend to a couple of terabytes of disk space or thousands of Excel sheets). A chunk of data this large cannot feasibly be handled, sorted, and analyzed by a single person. Here we require the help of data science, and most recently, the field of Artificial Intelligence has gained considerable limelight. With the use of efficient algorithms we can so...

3,494 0       BIG DATA


  How Google Utilizes Big Data for SERP

Google is an expert when it comes to big data. This is evident in its development of various techniques and open source tools that are used by big data industry professionals. These tools and techniques allow Google to sift through millions of different websites and enormous amounts of data in order to provide users with correct answers in a matter of milliseconds. But how does Google accomplish that with such precision? To answer that, we need to focus on the complex activities that go on behind every search query. Entering the search query: Google has always wanted to make a search engine...

2,428 0       GOOGLE BIG DATA


  Video website in big data era

Big data initially meant a data set too large to be analyzed, but the term later came to refer to the methods used to analyze huge amounts of data in order to extract great value. This approach is gradually gaining attention. Such data is difficult both to analyze and to store, and handling it requires unprecedented methods. In China, many companies now use the open source Hadoop distributed data cluster to meet their data statistics needs. Because data analysis can extract highly valuable segmented data from huge data sets, many websites have a big data strategy, espe...

7,515 0       NETFLIX BIG DATA DATA MINING