New Oil that Business Needs to Run

Data has often been described as the new oil that business needs to run. IBM CEO Ginni Rometty unpacked the metaphor at the World Economic Forum in Davos in 2019: “I think the real point to that metaphor,” Rometty said, “is value goes to those that actually refine it, not to those that just hold it.”
Another view of data came from Alphabet CFO Ruth Porat. “Data is actually more like sunlight than it is like oil because it is actually unlimited,” she said during a panel discussion in Davos. “It can be applied to multiple applications at one time. We keep using it and regenerating.”
AI is Booming, yet the Data Labeling Behind It is Inefficient
Many industries are actively embracing AI to speed their structural transformation. From autonomous driving to drones, from medical systems that assist in diagnosis to digital marketing, AI has empowered more and more industries.

Yann LeCun is widely credited as a pioneer of convolutional neural networks (CNNs), one of the key elements that spurred the AI revolution of the past decade. LeCun, a Turing Award winner, has said that developers need labeled data to train AI models, and that more high-quality labeled data produces better-performing AI systems.
As the AI industry booms, a large number of data providers have entered the market. Generally, there are two types of data service companies.
In-house
The company recruits a team of data labelers (300–500 people), trains them on each specific task, and distributes the work across different teams.

Subcontract
The company subcontracts the labeling project to smaller data factories. These subcontractors are usually located in Asia because of lower labor costs. Once the subcontractors complete the first round of QC, the company collects the labeled datasets and passes them to another partner for a second round of QC. The company then delivers the results to the client (Party A).
This traditional workflow is inefficient: it takes longer to process and carries higher overhead costs. ML companies are forced to pay high prices, while the small labeling factories hardly benefit.
ByteBridge: an Automated Data Annotation Platform to Empower AI
ByteBridge has made a breakthrough with its automated data labeling platform in order to empower data scientists and AI companies in an effective and engaging way.
On ByteBridge’s dashboard, developers can set labeling rules directly and monitor progress in real time, on a pay-per-task model with a clear estimated time and price.
Efficiency
- Real-time QA and QC are integrated into the labeling workflow, with a consensus mechanism to ensure accuracy.
- Consensus: the same task is assigned to several workers, and the answer returned by the majority is taken as correct.
- Through the dashboard, the simplified tasks are distributed to workers who have passed the training exam. Once an output passes the consensus check, the worker is paid instantly, which keeps labelers motivated.

- ByteBridge.io has millions of registered workers, with daily output of 2D bounding boxes reaching up to 100,000.
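
The consensus step described in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ByteBridge's actual implementation: the function name `consensus_label` and the `min_agreement` threshold are hypothetical and chosen only for this example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def consensus_label(
    answers: Sequence[Hashable], min_agreement: float = 0.5
) -> Optional[Hashable]:
    """Return the label given by a majority of workers, or None if there is none.

    `min_agreement` is an assumed parameter: the share of votes the top
    label must exceed before it is accepted as the consensus answer.
    """
    if not answers:
        return None
    label, votes = Counter(answers).most_common(1)[0]
    # Accept the top label only when it clears the agreement threshold;
    # otherwise the task would need more workers or a manual review.
    return label if votes / len(answers) > min_agreement else None


# Three workers label the same image; two of them agree.
print(consensus_label(["car", "car", "truck"]))  # → car
```

Returning `None` on a tie or weak majority, rather than picking an arbitrary winner, leaves room for the task to be reassigned before any answer is accepted.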
“We focus on addressing practical issues in different application scenarios for AI development through a one-stop, automated data platform. Data labeling industry should take technology-driven as core competitiveness with efficiency and cost advantage,” said Brian Cheong, CEO and founder of ByteBridge.
End
It is undeniable that data has become a rare and precious resource. ByteBridge has recognized the power of data and aims to provide the best data labeling service to accelerate the development of AI with accuracy and efficiency.



How to Make Data Annotation More Efficient? was originally published in Becoming Human: Artificial Intelligence Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.
