Suggested Readings - Chapter 1: Data Mining
-iceberg_online(data);
2022-12-27(#15058703@0)+6
The most commonly accepted definition of “data mining” is the discovery of “models” for data.
-iceberg_online(data);
2023-9-11(#15658067@0)+1
However, more generally, the objective of data mining
is an algorithm.
-iceberg_online(data);
2023-9-16(#15665886@0)+1
Bonferroni’s Principle is really a warning about
overusing the ability to mine data.
-iceberg_online(data);
2023-9-16(#15665880@0)+1
If you look in your data for too many things at the same time, you will see things that look interesting but are in fact simply statistical artifacts and have no significance.
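A small sketch of that effect, under assumptions that are mine and not from the book: each "candidate" is a purely random run of fair coin flips, and a candidate counts as "interesting" when every flip comes up heads. The function names and parameters are made up for illustration.

```python
import random


def expected_false_hits(num_candidates, p_chance):
    """Expected number of 'interesting' candidates in purely random data."""
    return num_candidates * p_chance


def simulate_false_hits(num_candidates, num_flips, seed=0):
    """Count random candidates whose flips are all heads (illustrative only)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(num_candidates):
        # A candidate "looks interesting" if all of its flips are heads.
        if all(rng.random() < 0.5 for _ in range(num_flips)):
            hits += 1
    return hits


if __name__ == "__main__":
    candidates, flips = 100_000, 10
    # 100000 / 1024 = 97.65625 hits expected from chance alone.
    print("expected by chance:", expected_false_hits(candidates, 0.5 ** flips))
    print("found in random data:", simulate_false_hits(candidates, flips))
```

Ten heads in a row looks striking for any single candidate, yet with 100,000 candidates roughly a hundred such runs are expected from chance alone, so finding them says nothing.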
-iceberg_online(data);
2023-9-16(#15666585@0)
To deal with applications such as these, a new software stack has evolved. These programming systems are designed to get their parallelism not from a “supercomputer,” but from “computing clusters”: large collections of commodity hardware, including conventional processors (“compute nodes”) connected by Ethernet cables or inexpensive switches.
-iceberg_online(data);
2023-9-17{311}(#15667470@0)
Concept developed by Google: (1) achieve parallel computing on a large array of inexpensive machines; (2) tolerate hardware failures.
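A toy sketch of those two goals, not Google's actual system: threads in one process stand in for the inexpensive machines, and a made-up `SimulatedNodeFailure` exception stands in for a hardware fault. All names here (`sum_chunk`, `run_with_retries`, the `failing` set) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class SimulatedNodeFailure(Exception):
    """Stand-in for a worker machine dying mid-task (hypothetical)."""


def sum_chunk(chunk_id, chunk, attempt, failing):
    # 'failing' holds (chunk_id, attempt) pairs that should crash.
    if (chunk_id, attempt) in failing:
        raise SimulatedNodeFailure(f"chunk {chunk_id}, attempt {attempt}")
    return sum(chunk)


def run_with_retries(chunks, failing=frozenset(), max_attempts=3, workers=4):
    """Sum all chunks in parallel, re-running only the tasks that failed."""
    results = {}
    pending = {cid: 1 for cid in range(len(chunks))}  # chunk_id -> attempt no.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            futures = {
                cid: pool.submit(sum_chunk, cid, chunks[cid], attempt, failing)
                for cid, attempt in pending.items()
            }
            retry = {}
            for cid, future in futures.items():
                try:
                    results[cid] = future.result()
                except SimulatedNodeFailure:
                    if pending[cid] >= max_attempts:
                        raise  # give up once the retry budget is spent
                    retry[cid] = pending[cid] + 1
            pending = retry
    return sum(results.values())


if __name__ == "__main__":
    data = list(range(100))
    chunks = [data[i:i + 10] for i in range(0, 100, 10)]
    # Chunk 2 fails once and chunk 5 fails twice; both succeed on retry.
    print(run_with_retries(chunks, failing={(2, 1), (5, 1), (5, 2)}))
```

The point of the design is that a failure costs only the re-execution of the affected task; work already finished on other workers is kept.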
-iceberg_online(data);
2023-9-23(#15677664@0)