Hadoop Big Data Analysis Framework.
Hadoop Explained: An introduction to the most popular big data platform in the world.
Mastering Apache Spark.
Programming Pig. Covers almost every feature of Pig: the different modes it can be run in, complete coverage of the Pig Latin language, and how to extend Pig with your own User Defined Functions (UDFs).
Spark: The Definitive Guide. A deep dive into how Spark runs on a cluster; detailed examples in SQL, Python and Scala; Structured Streaming and Machine Learning; examples of GraphFrames and Deep Learning with TensorFrames.
The Free Hive Book. A free electronic book about Apache Hive, geared towards SQL-knowledgeable business users, with some advanced tips for DevOps.
Data-Intensive Text Processing with MapReduce. Scalable approaches to processing large amounts of text with MapReduce.
Field Guide to Hadoop. An introduction to Hadoop, its ecosystem and aligned technologies.