Image by David Schwarzenberg from Pixabay

And why we need data management, data literacy and data analytics

Data has become such a common word that many of us have probably never thought about its exact definition. What first comes to mind is most likely a spreadsheet, a table, or a chart made up of numbers and labels. When people talk about big data, the notion becomes even more abstract: an enormous number of bytes flowing through devices and servers that requires programs to decipher. While data can be understood by machines, it loses most of its meaning to humans when stored in a file or table. We rely on other people…


Image by Qimono from Pixabay (CC0)

Back in 1958, Hans Peter Luhn, a researcher at IBM, introduced the concept of Business Intelligence (BI), using the definition from Webster’s Dictionary: to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal. Given this definition, Business Intelligence is indeed a vision. It should not be equated with the tools or technologies available at any given time. In other words, it should be viewed as a company’s strategic vision for transforming data assets into business insights to make data-driven decisions.

Historically, there were two eras that revolutionized and popularized the concept…


Photo by xdfolio via pixabay (CC0)

Data management, including metadata management, data governance, and master data management, has been advocated since the beginning of the data warehousing era in the 1980s. It has, however, been hard to implement and enforce. An enterprise can still get by with its data warehousing projects without it, but the resulting project and organizational silos introduce data duplication, inefficiencies, and the lack of a clear source of truth. These have been a headache for many organizations, in particular those that hold a lot of data across multiple systems.

With the recent rise of data privacy issues followed by several regulations put in place, it…


Photo via Pixabay

In the new era of Big Data and Data Science, it is vitally important for an enterprise to have a centralized data architecture that is aligned with business processes, scales with business growth, and evolves with technological advancements. A successful data architecture provides clarity about every aspect of the data, which enables data scientists to work efficiently with trustworthy data and to solve complex business problems. It also prepares an organization to take advantage of new business opportunities quickly by leveraging emerging technologies, and it improves operational efficiency by managing complex data and information delivery throughout the enterprise.

When compared with information…



What NoSQL databases can do that a relational database cannot

NoSQL databases are increasingly popular in modern data architecture. They have become a powerful way to store data in a specialized format that yields fast performance for large amounts of data. Many NoSQL databases are already available on the market, and new ones are still emerging. The most popular categorization consists of four types: Wide Column, Document, Key-Value Pairs, and Graph. A few popular NoSQL databases are listed below:

  • Wide Column: Cassandra, HBase, AWS DynamoDB
  • Document: Couchbase, MongoDB, Azure Cosmos DB, AWS DynamoDB
  • Graph: Neo4J, Azure Cosmos DB, TigerGraph, AWS Neptune
  • Key-Value Pairs…


With rapid advances in AI and data science, data has become an essential asset to every enterprise. Setting up a data strategy has therefore become every enterprise’s mission, particularly in the C-suite and at executive levels. What is a data strategy, and how do we create the right one? I would like to dedicate this article to answering these two questions.

Before discussing data strategy, we need to understand what a strategy is. Using a simplified definition, a strategy is a thoughtful plan focused on changing the current state in order to reach a vision for the future…



The evolution of Big Data technologies over the last 20 years has been a history of battles with growing data volume. The challenge of big data has not been solved yet, and the effort will certainly continue as data volume keeps growing in the coming years. The original relational database management system (RDBMS) and the associated OLTP (Online Transaction Processing) make it easy to work with data using SQL in all aspects, as long as the data size is small enough to manage. …


The practice of design patterns is most popular in Object-Oriented Programming (OOP) and is effectively explained and summarized in the classic book “Design Patterns: Elements of Reusable Object-Oriented Software” by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Below is the definition of a design pattern from Wikipedia:

“A software design pattern is a general, reusable solution to a commonly occurring problem within a given context in software design. It is not a finished design that can be transformed directly into source or machine code. It is a description or template for how to solve a problem that can be used in many…



Several years ago, I met a senior director from a large company. He mentioned that the company he worked for was facing data quality issues that eroded customer satisfaction, and that he had spent months investigating the potential causes and how to fix them. “What have you found?” I asked eagerly. “It is a tough issue. I did not find a single cause; on the contrary, many things went wrong,” he replied. He then started citing a long list of what contributed to the data quality issues: almost every department in the company was involved and it was hard for him…


I started my career as an Oracle database developer and administrator back in 1998. Over the past 20+ years, it has been amazing to see how IT has evolved to handle the ever-growing amount of data, via technologies including relational OLTP (Online Transaction Processing) databases, data warehouses, ETL (Extraction, Transformation and Loading) and OLAP (Online Analytical Processing) reporting, big data, and now AI, Cloud and IoT. All these technologies were enabled by the rapid growth in computational power, particularly in terms of processors, memory, storage, and networking speed. …

Stephanie Shen

Data and Technology Executive, #BigData #ML #Analytics #DataGovernance, also love photography and travel.
