
What is Metadata and Why Does it Matter?


by Stephanie Schuldes

Where do invoices, emails, and documents end up first? In a folder. But any clever structure reaches its limits when several heads are working together: Where is that document again? I need it for my project. Just make a copy … And so the chaos takes its course.

In the end, only one thing matters: that everyone gets the information they need as quickly as possible, regardless of where it is. Employees spend an average of two hours per day searching for information, according to the Value of Data study, conducted by Vanson Bourne for Veritas. They could have spent that time more wisely working with that information.

Metadata is one way to make daily tasks more efficient. The term may sound technical, but the idea is something everyone knows from everyday life.

What is metadata?

Metadata is information that describes and contextualizes a piece of content. Because metadata can take many forms, this definition is intentionally broad. In general, metadata is classified into three types:

  • Descriptive metadata makes it easier to find and understand the content.
  • Administrative metadata aids in the correct management and storage of content in accordance with rights and regulations.
  • Structural metadata describes how different parts of the content interact with one another.

Metadata is not limited to documents or technical data. It is used in everyday situations to categorize things. Metadata includes information such as labels on boxes during a move, musical genres, and the date and size of a photo file.

Types of metadata

The National Information Standards Organization (NISO) lists the following types of metadata:

  • descriptive metadata
    • helps find or understand a resource by providing information about the content
    • example: title, author, subject, genre, publication date
    • primary use: discovery, display, interoperability
  • administrative metadata
    • technical metadata
      • provides information needed to decode and render the resource
      • example: file type, file size, creation date/time, compression scheme
      • primary use: interoperability, digital object management, preservation
    • preservation metadata
      • helps with long-term storage of digital resources
      • example: checksum, hash, preservation event
      • primary use: interoperability, digital object management, preservation
    • rights metadata
      • provides information about the intellectual property rights that regulate the use of the resource and its content
      • example: Creative Commons license
      • primary use: interoperability, digital object management
  • structural metadata
    • provides information about how different parts of one or more resources are related to each other
    • example: page sequence of a document, table of contents of a document, connection of different resolutions of identical content
    • primary use: navigation
  • markup languages
    • combine metadata and data/content
    • example: formatting in a document, tagging of words with semantic information (e.g. place, part of speech)
    • primary use: navigation, interoperability
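
To make the categories above more concrete, here is a minimal sketch of how a single document's metadata could be grouped by type. The class names, field names, and sample values are hypothetical and chosen only for illustration; they are not taken from NISO or from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class DescriptiveMetadata:
    # Helps people find and understand the content.
    title: str
    author: str
    subject: str


@dataclass
class AdministrativeMetadata:
    file_type: str   # technical metadata
    size_bytes: int  # technical metadata
    checksum: str    # preservation metadata
    license: str     # rights metadata


@dataclass
class StructuralMetadata:
    # Describes how the parts of the content relate to each other.
    page_sequence: list[str] = field(default_factory=list)


@dataclass
class MetadataRecord:
    descriptive: DescriptiveMetadata
    administrative: AdministrativeMetadata
    structural: StructuralMetadata


# Hypothetical sample record; the checksum is a placeholder, not a real hash.
record = MetadataRecord(
    descriptive=DescriptiveMetadata("Q3 Invoice", "Finance Team", "invoice"),
    administrative=AdministrativeMetadata(
        "application/pdf", 48213, "sha256:placeholder", "CC BY 4.0"
    ),
    structural=StructuralMetadata(["cover", "line-items", "totals"]),
)
```

Keeping the groups separate mirrors their different uses: the descriptive fields drive search, while the administrative fields are mostly consumed by the system itself.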

What are the benefits of metadata over folders?

Folders are not as adaptable as metadata. A traditional folder-based approach organizes content into labeled bins for specific purposes, which makes the content difficult to locate in any other context. You might need the same information for another project. A colleague from another department might benefit from it in a completely different context. In this case, one of two things can happen: you either don’t use the content (because you forgot or didn’t know it existed), or you create a duplicate in your current working directory. And perhaps your colleague does the same. As a result, the content ends up in multiple locations, which takes up more storage space and makes it hard to tell which copy is the current version.

In a metadata-based approach, it makes no difference where the content is stored. When you perform a search that includes the metadata associated with the content, it will appear in the results.
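
A minimal sketch of this idea, assuming each document is represented as a plain dictionary of metadata. The paths, field names, and the `find` helper are hypothetical and only illustrate that the storage location plays no role in the lookup.

```python
# Three documents stored in unrelated folders (hypothetical sample data).
documents = [
    {"path": "/projects/alpha/offer.pdf", "type": "offer", "customer": "Acme", "year": 2023},
    {"path": "/sales/archive/contract_17.pdf", "type": "contract", "customer": "Acme", "year": 2022},
    {"path": "/finance/inbox/inv-0042.pdf", "type": "invoice", "customer": "Globex", "year": 2023},
]


def find(docs, **criteria):
    """Return the documents whose metadata matches every given criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


# Everything related to one customer, regardless of folder.
acme_docs = find(documents, customer="Acme")

# Combining criteria narrows the result further.
acme_2023 = find(documents, customer="Acme", year=2023)
```

The `path` field is still there, but the search never looks at it. A real information management system would use an index rather than a linear scan; the principle is the same.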

Why is metadata important?

Metadata helps solve typical problems that occur every day when working with files, such as:

  • What is the most recent version? And where can I find it?
  • Where should I store this? Which is the right folder?
  • Who made the edits to this document?

With a metadata-centric approach, you only specify what type of information you want to store in the system, not where. This means you don’t need to think about folder structures or where you (or others) might need this piece of information in the future.

You also don’t need to save duplicates in different locations. All you need to do is save as many details about the piece of information as you think are important for identifying it later.

Benefits of metadata

  • Find information more easily. Metadata makes it easy to narrow down search results depending on who is looking for information. Users can flexibly combine search criteria so that they fit their needs. This allows them to quickly access the information they need without having to know where exactly the content is stored. This method of accessing information also allows users to discover relevant content that they would not have found otherwise. It provides them with a broader view and more possibilities.
  • No duplicates. You can access files via search and filters, no matter where they are located in the system.
  • Content in context. Metadata lets you organize content more organically and link pieces of content, even if they are stored in different repositories or applications. This context allows different people to access and use content the way they need to.
  • Better search. Search results are more precise since you can narrow down your search with categories relevant to the content itself (instead of searching through folders named after projects etc.).
  • Efficient authorization. Permissions used to be tied to folders, which meant they had to be changed at the folder level. Metadata lets you implement permissions via roles. This ties permissions to user management and makes it much easier to control access rights for entire groups of users.
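
The role-based permissions mentioned in the last point can be sketched as follows. This is a simplified illustration, not how any specific product implements access control; the role names, document types, and the `can_read` function are assumptions.

```python
# Hypothetical rules: which document types each role may read.
ROLE_RULES = {
    "accounting": {"invoice", "receipt"},
    "sales": {"offer", "contract"},
}


def can_read(user_roles, doc_metadata):
    """Allow access if any of the user's roles covers the document's type."""
    doc_type = doc_metadata.get("type")
    return any(doc_type in ROLE_RULES.get(role, set()) for role in user_roles)


invoice = {"type": "invoice", "customer": "Acme"}

accountant_allowed = can_read({"accounting"}, invoice)
sales_allowed = can_read({"sales"}, invoice)
```

Because the rule refers to the document type rather than to a folder, moving the file does not change who may see it, and granting access to a new colleague only requires assigning a role.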

How is metadata created?

Ideally, metadata is created at the time the content is captured in the system. Otherwise, untagged content accumulates faster than it can be described, and the collection becomes difficult to manage.

Metadata can be added manually, semi-automatically, or fully automatically using machine learning algorithms.

  • manual: Users add metadata by hand
    • pro: accurate
    • con: lots of work, tedious, time-consuming
  • semi-automated: The system makes automatic suggestions and users accept or correct them.
    • pro: makes capture easier for users
    • con: still needs human interaction
  • machine learning: An algorithm tags content automatically.
    • pro: the least manual effort for users, accuracy can be very high, fast, users have more time for other tasks
    • con: needs to be trained before it can be used, errors may occur
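
As a rough illustration of automatic and semi-automatic capture, the sketch below reads technical metadata directly from a file and proposes a document type that a user could accept or correct. The function names are hypothetical, and the keyword rule in `suggest_document_type` is only a stand-in for a trained model.

```python
import hashlib
import mimetypes
from datetime import datetime, timezone
from pathlib import Path


def extract_technical_metadata(path):
    """Collect metadata that can be derived from the file without user input."""
    p = Path(path)
    stat = p.stat()
    mime_type, _ = mimetypes.guess_type(p.name)
    return {
        "file_name": p.name,
        "file_type": mime_type or "unknown",
        "size_bytes": stat.st_size,
        "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
        "checksum_sha256": hashlib.sha256(p.read_bytes()).hexdigest(),
    }


def suggest_document_type(file_name):
    """Naive keyword rule standing in for a trained classifier (assumption)."""
    name = file_name.lower()
    for keyword in ("invoice", "contract", "offer"):
        if keyword in name:
            return keyword
    return None  # no suggestion; the user has to tag the document manually
```

In a semi-automated setup, the result of `suggest_document_type` would be shown as a pre-filled value in the capture form, while the technical fields would be stored without asking the user at all.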

How do you work with metadata?

Metadata is data that contains information about other data. It allows you to get a sense of what a document or other piece of content is about before you open it – more than just the file name or location. It informs you about specific aspects of the actual content. Because it is structured, it helps to classify content into categories, which can then be used to narrow or filter the results of a search.

Metadata also expands the number of ways to access content. If you don’t know what to look for in one category, try another, or combine several to narrow your search. The same piece of content can therefore turn up under many different combinations of search terms and filters, in contexts that share only some characteristics. Even if you were not looking for that particular piece of content, it may be useful for your current task.
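
One common way to offer these access paths is to show, for each category, which values exist and how many items carry them. The following sketch assumes documents are plain dictionaries; the sample data and the `facet_counts` helper are hypothetical.

```python
from collections import Counter

# Hypothetical sample data.
documents = [
    {"type": "invoice", "customer": "Acme", "year": 2023},
    {"type": "invoice", "customer": "Globex", "year": 2023},
    {"type": "contract", "customer": "Acme", "year": 2022},
]


def facet_counts(docs, key):
    """Count how many documents carry each value of one metadata category."""
    return Counter(d[key] for d in docs if key in d)


# Overview of the whole collection by document type.
by_type = facet_counts(documents, "type")

# After filtering by customer, the remaining facets show what else is available.
acme_only = [d for d in documents if d["customer"] == "Acme"]
acme_by_type = facet_counts(acme_only, "type")
```

Such counts tell users which filters are worth trying before they run a search, which is how content they had not considered looking for becomes visible.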

Moreover, an information management system can build on metadata to provide features that improve daily efficiency and productivity:

  • Workflow automation handles repetitive tasks in business processes.
  • Permissions management via roles and rights improves information security and compliance.

How do you define a metadata structure suited for your organization?

While some categories are more or less universal, others are highly specific to a line of business or even a single organization. A metadata structure can be customized to fit the needs of any organization. This will probably take some effort, but the rewards are high.

Conclusion

Huge amounts of data are created every day. Finding what you need is becoming increasingly difficult and time-consuming. This has a negative effect on efficiency. Without the right information, making business decisions is difficult.

Metadata provides information about other data. It helps us find information by giving structure to otherwise unstructured data or information.

Metadata lets you access assets in various ways and for various purposes. By making assets easy to find, it helps reduce processing times (and time to market). By exposing specific facets of the assets for search and grouping, it offers new perspectives on the data in the system, so that the data can be used in new contexts.

Metadata is critical for maximizing the value of digital assets because it makes them easier to find, understand, and use.