The Story of Internet Object

There is only one thing stronger than all the armies of the world: and that is an idea whose time has come.

The Story Behind Internet Object

In the summer of 2017, I was deeply involved in creating a REST API for a project, using JavaScript for the front end and Python for the back end. Over the years, I've developed many REST APIs, but this time was different. I noticed a big problem: our use of JSON was causing us to send too much unnecessary data between the client and server. JSON's requirement for key/value pairs seemed too wordy. I thought about how much bandwidth we could save if we could just send the essential values without the keys. This idea wasn't completely new to me; it had come up before but never so clearly or urgently. This time, I felt a strong need to fix this inefficiency and find a better solution.

When I started obsessing over this idea, I realized that the extra bytes from JSON keys were just the beginning. Because JSON doesn't enforce a schema by default, it brings a lot of problems: issues with data validation, unclear data structures, longer development times, and higher costs. Think about the process of converting data to and from JSON: for just one API endpoint serving both desktop and mobile clients, the validation work can be huge. Data often needs to be validated six times, once on sending and once on receiving for each of the server, the desktop client, and the mobile client. Imagine how much effort, time, and money could be saved if validation were built in. While there are libraries and frameworks that can help, they still require extra work and can lead to a scattered validation process.

I think when you have a lot of jumbled up ideas they come together slowly over a period of several years.

The problems with JSON go beyond just the combination of keys and values and the lack of a default schema; they also involve the blending of data and headers (or metadata). JSON doesn't provide a clear way to separate these elements. For example, consider a situation where you need to return a large list of employee records with pagination. In these cases, the structure of your API's response can highlight this issue. See the following JSON response.

JSON Example

When you look closer at this JSON document, you can see a major issue: the line between the actual data, such as employees, and what could be seen as headers or metadata, such as result, count, currentPage, and pageSize, is blurred. These elements are all mixed together in the same object, making the structure confusing. Here, important information about the employee collection is mixed with details about pagination and result status, which makes things less clear and harder to work with. This simple example highlights JSON's core limitation in separating essential data from extra metadata, showing the need for a more structured solution.
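To make the problem concrete, here is a minimal Python sketch of a response of the kind described. Only the top-level field names (result, count, currentPage, pageSize, employees) come from the text; the values, the employee fields, and the split_response helper are assumptions made for illustration.

```python
import json

# Hypothetical paginated response; only the top-level field names come from the text.
response_text = """
{
  "result": "success",
  "count": 2,
  "currentPage": 1,
  "pageSize": 2,
  "employees": [
    {"name": "Employee One", "age": 25, "isActive": true},
    {"name": "Employee Two", "age": 32, "isActive": false}
  ]
}
"""

def split_response(doc, data_key="employees"):
    """Separate the payload from the pagination/status fields by key name."""
    header = {k: v for k, v in doc.items() if k != data_key}
    return header, doc[data_key]

doc = json.loads(response_text)
header, employees = split_response(doc)
print(header)          # status and pagination fields only
print(len(employees))  # number of employee records
```

Notice that the client has to know in advance that "employees" is the data key. Nothing in the JSON document itself marks which fields are data and which are metadata.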

The Quest for a Solution

This led me to think that JSON, despite being simple and easy to learn, might not be the best format for exchanging data on the Internet. JSON is user-friendly, straightforward, and intuitive, but it lacks some important features. It wasn't originally designed for the web; it was borrowed from JavaScript, and its simpler structure made it a popular choice over more complex formats and protocols like XML, SOAP, and RPC, and over unstructured methods like text parsing. JSON made data exchange on the web more accessible and displaced the previously dominant approaches. It has come a long way and achieved a lot in RESTful APIs, but it still falls short in meeting some basic needs of the web.

After realizing these issues, I became even more determined to find a solution. I started exploring different formats, approaches, and mechanisms, including JSON, GraphQL, SGML, XML, SOAP, CSV, YAML, HTML, MIME, and others, focusing only on human-readable, text-based formats and avoiding binary options like Protobuf, MessagePack, Avro, etc. Even though not all of these were primarily for data interchange, each provided valuable insights and inspiration. From this extensive review, I identified key qualities that a new web serialization format must have to meet the changing needs of data interchange on the web.

Text-based and Human-friendly

It must be human-friendly and easy to work with. This includes being simple to read and understand, as well as providing clear and consistent formatting.

Compact and Readable

The solution must eliminate excess baggage in data exchange, such as redundant keys and unnecessary bytes, without compromising readability. This can be achieved through a more compact and efficient data format that removes unnecessary information and focuses on the data itself.
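A rough way to see the cost of repeated keys is to serialize the same records twice, once as ordinary JSON objects and once with the keys stated a single time followed by rows of values. The Python sketch below does this with hypothetical records; the "keys"/"rows" layout is only an illustration of the idea, not the Internet Object format itself.

```python
import json

# Hypothetical records; the field names are illustrative only.
records = [
    {"name": "Employee One", "age": 25, "isActive": True},
    {"name": "Employee Two", "age": 32, "isActive": False},
    {"name": "Employee Three", "age": 41, "isActive": True},
]

# Conventional JSON: every record repeats every key.
keyed = json.dumps(records, separators=(",", ":"))

# Positional variant: keys are stated once, then each record is a row of values.
keys = list(records[0])
rows = [[rec[k] for k in keys] for rec in records]
positional = json.dumps({"keys": keys, "rows": rows}, separators=(",", ":"))

keyed_bytes = len(keyed.encode("utf-8"))
positional_bytes = len(positional.encode("utf-8"))
print(keyed_bytes, positional_bytes)
```

The saving grows with the number of records, because the keys are paid for once rather than once per record.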

Schema First Design

It must enforce a schema, providing standard data validation and integrity assurance. This ensures that data is consistent and reliable, and reduces the burden on endpoints to perform their own validation.

Platform-independent and Hardware Agnostic

It must provide an efficient way to store data on disk or transport it over the wire, such as through HTTP or other mediums. This includes support for different types of data storage and transport, such as binary and text formats.

Separate Metadata (Headers)

The solution must keep the data separated from the metadata to ensure clarity and prevent errors in data interpretation. By separating metadata from data, it becomes easier to identify and handle the relevant information and reduces the risk of ambiguity in data transmission and processing.

Comments

It must support the embedding of comments, allowing for easy documentation and understanding of the data.

Data Streaming

The format should inherently support data streaming, enabling the continuous flow of information. Unlike JSON, which requires encapsulating objects and impedes streaming separate records without auxiliary formats like JSONL, this new standard should facilitate the smooth and independent processing of streaming data.
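For reference, the JSONL workaround mentioned above looks roughly like this in Python: each line is a complete JSON document, so records can be processed one at a time. The iter_records helper and the sample records are assumptions for illustration, and io.StringIO stands in for a real file or network stream.

```python
import io
import json

def iter_records(stream):
    """Yield one parsed record per non-empty line, without loading the whole stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# io.StringIO stands in for a socket or file; the records are hypothetical.
stream = io.StringIO(
    '{"name": "Employee One", "age": 25}\n'
    '{"name": "Employee Two", "age": 32}\n'
)
names = [rec["name"] for rec in iter_records(stream)]
print(names)
```

This works only because of the one-record-per-line convention layered on top of JSON; a single JSON array would have to be read to its closing bracket before a standard parser could return anything.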

Reusability - Variables and References

The solution should allow for the use of reusable values like variables and references. This would add meaningful names, reduce the size of data being transferred, protect sensitive information, and enable the repeated use of data.
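The idea of variables can be sketched in a few lines of Python: short names are defined once and then substituted wherever they appear in the data. The definitions table, the "@" naming, and the resolve helper below are hypothetical and are not the Internet Object syntax or its parser.

```python
# Hypothetical definitions: short variable names mapped to the values they stand for.
definitions = {"@m": "male", "@f": "female"}

def resolve(value, definitions):
    """Replace any value that names a known variable; leave everything else as is."""
    if isinstance(value, list):
        return [resolve(v, definitions) for v in value]
    if isinstance(value, str) and value in definitions:
        return definitions[value]
    return value

rows = [["Employee One", 25, "@m"], ["Employee Two", 32, "@f"]]
resolved = resolve(rows, definitions)
print(resolved)
```

The data on the wire carries only the short names, while the receiver expands them from the shared definitions.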

Data Types

The solution must efficiently manage various data types including standard and raw strings, decimal, octal, hex, and binary numbers, dates, Infinity, NaN, booleans, and nulls, as well as objects and arrays, ensuring comprehensive and flexible data processing.
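As a small illustration of why richer number handling matters, the Python sketch below parses decimal, octal, hex, and binary integers as well as Infinity and NaN, none of which standard JSON can represent except plain decimals. The prefixes used here (0o, 0x, 0b) are Python's own literal conventions and are only assumed to resemble what a text format might accept; parse_number is a hypothetical helper.

```python
import math

def parse_number(text):
    """Parse decimal, octal (0o), hex (0x), and binary (0b) integers, plus floats."""
    try:
        return int(text, 0)  # base 0 infers the base from the prefix
    except ValueError:
        return float(text)   # also accepts "Infinity" and "NaN"

samples = ["42", "0o52", "0x2A", "0b101010", "1.5", "Infinity", "NaN"]
values = [parse_number(t) for t in samples]
print(values[:5])
print(math.isinf(values[5]), math.isnan(values[6]))
```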

The Internet Object

During my research, the term "Internet Object" kept coming to mind. It seemed like a perfect name for a data object moving through the web and internet. Soon, I started calling this new concept "The Internet Object," reflecting its role in the digital world. After looking at various text-based data formats, I realized none of them fully matched what I had in mind. So, I decided to create a new solution to address the limitations of existing formats and provide a more efficient, effective, and user-friendly way to transfer data over the Internet. This led to the creation of the Internet Object.

From Concept to Reality

The Internet Object quickly became my passion project. I spent every spare moment developing new designs and structures, testing each version in different scenarios. I reached out to many developers and tech architects, gathering their feedback and refining my approach. After two years of curiosity, excitement, dedication, and research, by the end of 2019, I was ready to share my work with the community. The initial introduction of the Internet Object concept received an overwhelmingly positive response. While some skeptics questioned the need for a new format, many were interested and offered valuable suggestions for improvement. However, my work on the Internet Object was paused during 2020 and 2021 due to personal reasons.

In late 2022, I resumed my research on the Internet Object format, which I had begun in 2017. As of February 2024, the first draft is mostly ready. You can already try out the idea in the Playground, and the TypeScript/JavaScript parser is making good progress. After all these trials and iterations, the Internet Object format has emerged as simple, intuitive, schema-first, and efficient. It simplifies the representation of text-oriented structured objects while incorporating advanced features such as streaming support, in-data variables and comments, embedded and external schema support, and data/metadata separation.

The following example will give you a glimpse of the Internet Object format through sample schema definitions (external) and a REST API response that resembles the JSON response mentioned earlier in this story.

Internet Object Schema

Here, the provided Internet Object schema defines two main entities: an address object and a default schema for an employee object, along with two variables representing genders.

Employee Object Schema

The address object schema, $address, includes fields for street, city, state, and zip code, with the zip code constrained to be a 5-digit number. The default schema for the employee object, $schema, includes fields for name, age, isActive status, gender, and an optional address. The age field is constrained to be a number between 20 and 60, the isActive field is a boolean, and the gender field is restricted to the values represented by the variables @m (male) and @f (female).
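To show what these constraints amount to in practice, here is a minimal Python sketch of the same rules written as hand-rolled validation. It is not the Internet Object schema syntax or its validator. The string types for street, city, and state, the inclusive age bounds, the reading of "5-digit number" as an integer from 10000 to 99999, and the "m"/"f" values standing in for @m and @f are all assumptions.

```python
GENDERS = {"m", "f"}  # stands in for the values behind @m and @f (assumed)

def is_number(value):
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def validate_address(addr):
    """Return a list of problems found in an address dict (empty if valid)."""
    errors = []
    for field in ("street", "city", "state"):
        if not isinstance(addr.get(field), str):
            errors.append(f"address.{field} must be a string")
    zip_code = addr.get("zip")
    if not (isinstance(zip_code, int) and not isinstance(zip_code, bool)
            and 10000 <= zip_code <= 99999):
        errors.append("address.zip must be a 5-digit number")
    return errors

def validate_employee(emp):
    """Return a list of problems found in an employee dict (empty if valid)."""
    errors = []
    if not isinstance(emp.get("name"), str):
        errors.append("name must be a string")
    age = emp.get("age")
    if not (is_number(age) and 20 <= age <= 60):
        errors.append("age must be a number between 20 and 60")
    if not isinstance(emp.get("isActive"), bool):
        errors.append("isActive must be a boolean")
    gender = emp.get("gender")
    if not (isinstance(gender, str) and gender in GENDERS):
        errors.append("gender must be one of the allowed values")
    if emp.get("address") is not None:  # address is optional
        errors.extend(validate_address(emp["address"]))
    return errors

ok = {"name": "Employee One", "age": 25, "isActive": True, "gender": "m"}
bad = {"name": "Employee Two", "age": 17, "isActive": True, "gender": "f",
       "address": {"street": "1 Example St", "city": "Sample City",
                   "state": "EX", "zip": 123}}
print(validate_employee(ok))
print(validate_employee(bad))
```

With a schema-first format, rules like these are declared once alongside the data definition instead of being rewritten by hand in every client and server.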

The REST API Response in Internet Object format!

The REST response is structured as an Internet Object and contains both metadata and data.

Employee Object Data

The metadata includes the result of the API call, the count of items returned, the current page number, the page size, and information about the previous and next pages. The data section lists employees, each with their name, age, isActive status, gender, and optionally, their address. The address, when provided, follows the structure defined in the $address schema. This response demonstrates how Internet Object can be used to provide a clear and structured representation of data in a REST API response.

In conclusion, the Internet Object format marks a leap forward in data serialization, offering a simple, efficient, and schema-first solution. It addresses the shortcomings of traditional formats and introduces innovative features for enhanced versatility.

Join the Movement

As we continue to develop Internet Object, we welcome the community's contributions to shape its future. Together, we can revolutionize data exchange and storage for a more streamlined and effective digital world.

Please check out the specifications by visiting this link: https://docs.internetobject.org

Join us in collaborating, reviewing, and discussing how we can further enhance the Internet Object format for the future of the web. Whether you're a developer eager to contribute parsers and tools, or a tech enthusiast keen on spreading the word and offering constructive feedback, we invite you to join this movement. Together, let's make web data exchange quicker, more efficient, and user-friendly than it's ever been!

As William Gibson aptly put it, "The future is already here - it's just not very evenly distributed." Let's work together to spread the future of data serialization.

Thank you for taking the time to read this story!

Best Regards
Mohamed Aamir Maniar
https://linkedin.com/in/aamironline

Connect with Us!

Share your insights, learn from others, and contribute to the evolution of Internet Object. Your expertise can help shape the future of data serialization. Learn more and join us on our community page!