I was struggling with this piece of code. I had a similar piece of code that ran perfectly, so I had to go back and forth between the two, gradually deleting parts of the code to compare them.
It turned out that the mistake was really, really small, as shown in line 38:
I was missing the "=".
The problem I got:
Message: Job aborted due to stage failure: Task 0 in stage 365.0 failed 4 times, most recent failure: Lost task 0.3 in stage 365.0 (TID 12524, lamar.homenet.telecomitalia.it): java.lang.ClassCastException
Continue reading “Missing “=” sign”
Message: Could not initialize class com.databricks.spark.csv.util.CompressionCodecs$
One-line answer: make sure that the Scala version is the same for
- the Scala used to compile Spark
- the spark-csv module
- the Spark installation running on your system
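A quick way to spot such a mismatch is to compare the Scala suffix in the jar names against the Scala version Spark reports (the spark-shell banner prints a line such as "Using Scala version 2.10.5"). The helper below is only a minimal sketch: the function names and the sample jar names are hypothetical, and it assumes artifacts follow the usual "name_<scala binary version>" naming convention.

```python
import re


def scala_binary_version(full_version):
    """Reduce a full Scala version such as '2.10.5' to its binary version '2.10'."""
    major, minor = full_version.split(".")[:2]
    return f"{major}.{minor}"


def artifact_scala_suffix(artifact):
    """Return the Scala binary version encoded in an artifact name, or None.

    Assumes the conventional '<name>_<scala binary version>' naming,
    e.g. 'spark-csv_2.10-1.4.0.jar'.
    """
    match = re.search(r"_(\d+\.\d+)", artifact)
    return match.group(1) if match else None


def mismatched_artifacts(spark_scala_version, artifacts):
    """List artifacts whose Scala suffix differs from the Scala version Spark uses.

    Artifacts without a Scala suffix (plain Java jars) are ignored.
    """
    expected = scala_binary_version(spark_scala_version)
    return [
        name
        for name in artifacts
        if artifact_scala_suffix(name) not in (None, expected)
    ]


if __name__ == "__main__":
    # Hypothetical jar list, for illustration only.
    jars = ["spark-csv_2.11-1.4.0.jar", "commons-csv-1.1.jar"]
    print(mismatched_artifacts("2.10.5", jars))
```

Any jar reported by this check was built for a different Scala version than the one Spark is running and is a likely cause of the "Could not initialize class" error.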
Continue reading “Cannot write to csv with spark-csv in Scala”
Problem: in the Hive CLI, a simple query never returns a result; the job stays at 0 completed tasks.
Solution: make sure the Spark master has at least one worker (or slave) registered.
hive> select count(*) from subset1_data_stream_with_cgi;
Status: Running (Hive on Spark job)
Job Progress Format
CurrentTime StageId_StageAttemptId: SucceededTasksCount(+RunningTasksCount-FailedTasksCount)/TotalTasksCount [StageCost]
2016-06-30 15:09:54,526 Stage-0_0: 0/1 Stage-1_0: 0/1
2016-06-30 15:09:57,545 Stage-0_0: 0/1 Stage-1_0: 0/1
2016-06-30 15:10:00,561 Stage-0_0: 0/1 Stage-1_0: 0/1
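When the stages stay at 0/1 like this, it can help to check whether the Spark master has any live workers before digging further. The sketch below assumes a Spark standalone master whose web UI serves its status as JSON under /json/ with a "workers" list whose entries carry a "state" field; the URL, port, and function names are assumptions for illustration and may differ on your cluster.

```python
import json
from urllib.request import urlopen


def alive_worker_count(master_status):
    """Count the workers the master reports as ALIVE.

    `master_status` is the parsed JSON document; the 'workers' and 'state'
    keys are assumed to match the standalone master's JSON output.
    """
    workers = master_status.get("workers", [])
    return sum(1 for worker in workers if worker.get("state") == "ALIVE")


def fetch_master_status(master_ui_url, timeout=5.0):
    """Download the status JSON from the master web UI (assumed to be at /json/)."""
    with urlopen(master_ui_url.rstrip("/") + "/json/", timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical master address; replace with your own.
    status = fetch_master_status("http://localhost:8080")
    if alive_worker_count(status) == 0:
        print("No live workers: Hive on Spark jobs will wait forever.")
```

If the count is zero, start a worker and rerun the query.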
Continue reading “Hive on Spark is not working”
Problem: the Hive CLI shut down suddenly, and I could not start it again.
java.sql.SQLException: Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=/mnt/storage/DATA/hadoop/metastore_db;create=true, username = APP
Diagnosis: the embedded Derby database allows only one connection at a time, and it marks the database as in use by creating *.lck files in the databaseName folder shown above. When the Hive CLI dies unexpectedly, those files are left behind. So go to this folder and delete the *.lck files.
After I deleted dbex.lck and db.lck, Hive started as usual.
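The cleanup can be scripted. This is a minimal sketch, not an official Hive or Derby tool: the function name is hypothetical, and it assumes that no other Hive or Derby process is still using the metastore, since deleting the lock files of a live database can corrupt it. It defaults to a dry run that only lists the files.

```python
from pathlib import Path


def remove_derby_locks(metastore_dir, dry_run=True):
    """Find *.lck files directly inside the Derby metastore folder.

    Returns the names of the lock files found. They are deleted only
    when dry_run is False.
    """
    locks = sorted(Path(metastore_dir).glob("*.lck"))
    if not dry_run:
        for lock in locks:
            lock.unlink()
    return [lock.name for lock in locks]


if __name__ == "__main__":
    # Folder taken from the databaseName in the JDBC URL above.
    folder = "/mnt/storage/DATA/hadoop/metastore_db"
    print(remove_derby_locks(folder, dry_run=True))
```

Run it once with dry_run=True to see what would be removed, then again with dry_run=False.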