In ETL (extract, transform, load), data is collected from multiple sources and represented at a destination in a different form, or in a different context, than in the sources. For example, companies need customer data to track orders and make sure customers receive them. The same customer data is also analyzed and processed to identify buying patterns, so that companies can manage their inventory accordingly. The data is essentially the same in both cases, but it serves different purposes, so it is copied into different systems to fulfill each one.

The Hadoop ecosystem provides a variety of open-source ETL tools that connect various data sources to the Hadoop environment. These data sources include relational databases, machine data, web APIs, flat files, log files, and RSS (RDF Site Summary) feeds, to name a few.
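The customer-data example can be sketched in a few lines of plain Python. This is only an illustration of the idea of copying one source into two purpose-specific destinations, not how any Hadoop tool is implemented. The record fields (`order_id`, `customer`, `product`, `shipped`) and every function name below are assumptions made up for the sketch.

```python
from collections import Counter

# Hypothetical source records; the field names are assumptions for illustration.
SOURCE_ORDERS = [
    {"order_id": 1, "customer": "alice", "product": "kettle", "shipped": True},
    {"order_id": 2, "customer": "bob", "product": "toaster", "shipped": False},
    {"order_id": 3, "customer": "alice", "product": "kettle", "shipped": False},
]


def extract(source):
    """Extract: copy the raw records so the source itself is left untouched."""
    return [dict(row) for row in source]


def transform_for_tracking(rows):
    """Transform for order tracking: keep only the fields fulfilment needs."""
    return [
        {"order_id": r["order_id"], "customer": r["customer"], "shipped": r["shipped"]}
        for r in rows
    ]


def transform_for_analysis(rows):
    """Transform for analysis: count how many orders each product received."""
    return dict(Counter(r["product"] for r in rows))


def run_etl(source):
    """Load: return the two destination data sets built from the same source."""
    rows = extract(source)
    tracking_store = transform_for_tracking(rows)
    analytics_store = transform_for_analysis(rows)
    return tracking_store, analytics_store


tracking_store, analytics_store = run_etl(SOURCE_ORDERS)
print(tracking_store[0])   # {'order_id': 1, 'customer': 'alice', 'shipped': True}
print(analytics_store)     # {'kettle': 2, 'toaster': 1}
```

Both destinations are derived from the same three records, but each keeps only the shape its purpose needs: one row per order for tracking, one count per product for inventory planning.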