Questions and answers on big data on Stack Overflow

Stack Overflow is a question-and-answer site. It is free to use and requires no registration.

Big data is a blanket term for any collection of data sets so large and complex that it becomes difficult to process with on-hand database management tools or traditional data-processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization.

Data sets grow in size in part because they are increasingly gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies, software logs, cameras, microphones, radio-frequency identification (RFID) readers, and wireless sensor networks. What counts as "big data" varies with the capabilities of the organization managing the set and of the applications traditionally used to process and analyze the data set in its domain (from Wikipedia, the free encyclopedia).

Fetching 1 million records from SQL Server and storing them in MongoDB

Tue, 24 October 2017 10:34 +0000 GMT

Hadoop YARN SLS (Scheduler Load Simulator)

Tue, 24 October 2017 10:26 +0000 GMT

Big data, Kafka, Hadoop, Cloudera Manager

Tue, 24 October 2017 07:10 +0000 GMT

Cassandra two dimensional data modelling

Tue, 24 October 2017 03:51 +0000 GMT

Finding mutual friends for a huge dataset

Tue, 24 October 2017 02:20 +0000 GMT

Sqoop: java.lang.RuntimeException: Can’t parse input data: ’

Tue, 24 October 2017 00:56 +0000 GMT

Transforming one row into many rows using Amazon Glue

Mon, 23 October 2017 21:10 +0000 GMT

What is the speed or accuracy difference between number crunching in Python vs. Node.js? [on hold]

Mon, 23 October 2017 20:26 +0000 GMT

Big Data Transformation | Reverse | One to Many | Many to One

Mon, 23 October 2017 15:57 +0000 GMT

R - Big data: generalized linear mixed-effects models

Mon, 23 October 2017 15:11 +0000 GMT

Timestamp precision in Cassandra

Mon, 23 October 2017 13:05 +0000 GMT

MapReduce: My reducer function is not working (returning top n common names)

Mon, 23 October 2017 08:16 +0000 GMT

Handle a huge array with PHP and MongoDB

Mon, 23 October 2017 07:49 +0000 GMT

Data mining with unstructured data: how to implement it?

Mon, 23 October 2017 04:21 +0000 GMT

Can't figure out this SQL from a survey

Sun, 22 October 2017 19:30 +0000 GMT

Mapper code processing spaces as well

Sun, 22 October 2017 16:23 +0000 GMT

Unable to start SecondaryNameNode, DataNode, NodeManager while starting Hadoop

Sun, 22 October 2017 14:17 +0000 GMT

Run a Hadoop MapReduce program in Python

Sat, 21 October 2017 20:32 +0000 GMT

Database column insertion when the column strings do not fully match

Sat, 21 October 2017 15:01 +0000 GMT

Pivoting 1,620 columns to rows in a 360 GB text file in AWS

Sat, 21 October 2017 06:56 +0000 GMT

Efficient algorithm for computing quantiles in a terabyte-scale dataset

Fri, 20 October 2017 18:51 +0000 GMT

Mongoid pluck in batches

Fri, 20 October 2017 09:30 +0000 GMT

Big data and cloud computing [on hold]

Thu, 19 October 2017 23:20 +0000 GMT

Large-scale volume rendering and visualization libraries for terabyte-size data

Thu, 19 October 2017 14:39 +0000 GMT

Reading a 25 GB nested JSON file with jsonlite in R

Thu, 19 October 2017 09:50 +0000 GMT

To continue a legacy design, can we implement a star schema in Hive?

Thu, 19 October 2017 09:08 +0000 GMT

How to handle large amounts of data in TensorFlow?

Wed, 18 October 2017 23:02 +0000 GMT

MongoDB Atlas alert: Query Targeting: Scanned Objects / Returned has gone above 1000

Wed, 18 October 2017 09:07 +0000 GMT

Is it possible to create a Hive table with text output format?

Wed, 18 October 2017 07:03 +0000 GMT

java.lang.OutOfMemoryError in Spark Job for StringBuffer.append()

Wed, 18 October 2017 02:54 +0000 GMT