Release Notes

Release notes for the 1.2 version of the Spark connector.

Couchbase Spark Connector 1.2.1 GA (June 2016)

Version 1.2.1 is the second stable release of the 1.2.x series. It fortifies support for Spark 1.6 and brings the following enhancements and bug fixes:

Spark Core

  • The underlying SDK has been updated to 2.3.0.
  • SPARKC-52: Writes no longer bubble up errors when SaveMode IGNORE is used; see the sketch after this list.
  • SPARKC-51: Added a JVM hook so that connections are eventually cleaned up properly.
  • SPARKC-56: Writes now support transparent retry, just like reads.
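
The following is a minimal sketch of an IGNORE-mode write. It assumes an existing DataFrame named airlines and the connector's com.couchbase.spark.sql._ import, which adds the couchbase() method to the DataFrameWriter; adjust the names to your setup.

    import org.apache.spark.sql.SaveMode
    import com.couchbase.spark.sql._

    // With SaveMode.Ignore, documents whose IDs already exist in the bucket
    // are skipped instead of surfacing an error.
    airlines.write
      .mode(SaveMode.Ignore)
      .couchbase()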

Spark SQL

  • SPARKC-50: The (META_)ID field can now be any type that can be converted into a string, rather than having to be an actual string. This allows more flexibility when working with different types; see the sketch after this list.
  • SPARKC-48: LIKE-based filters now escape the . and * characters.
  • SPARKC-53: Deeply nested attributes in filters are now properly parsed and escaped.
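
As a rough illustration, the DataFrame below uses a Long for the ID column; the connector converts it into a string document ID on write. The column and method names assume the connector's defaults (META_ID as the ID field and the couchbase() writer added by com.couchbase.spark.sql._).

    import sqlContext.implicits._
    import com.couchbase.spark.sql._

    // META_ID is a Long here, not a String; it is converted to the document ID.
    val users = Seq((1L, "alice"), (2L, "bob")).toDF("META_ID", "name")
    users.write.couchbase()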

Spark Streaming

  • SPARKC-55: The FromNow option now works, allowing you to start streaming from the current point in time without first receiving all previous mutations; a sketch follows this list.
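
A rough sketch of streaming from the current point in time is shown below. The couchbaseStream helper and the FromNow value are assumed to come from com.couchbase.spark.streaming._; consult the streaming documentation for the exact signature in your version.

    import com.couchbase.spark.streaming._

    // Only mutations that happen after the stream starts are delivered;
    // earlier mutations are not replayed.
    val stream = ssc.couchbaseStream(from = FromNow)
    stream.foreachRDD(rdd => println(rdd.count()))
    ssc.start()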

Couchbase Spark Connector 1.2.0 GA (May 2016)

Version 1.2.0 is the first stable release of the 1.2.x series. It brings support for Spark 1.6 and the following enhancements and bug fixes:

Spark Core

  • Support for Apache Spark 1.6.x.
  • SPARKC-37: Both the Java and Scala APIs now expose the Couchbase view, spatial view, and N1QL query APIs on RDDs in addition to the SparkContext.
  • SPARKC-46: Subdocument lookups are supported on RDDs and the SparkContext (requires Couchbase Server 4.5); see the sketch after this list.
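
The sketch below shows a N1QL query issued from the SparkContext, followed by a subdocument lookup on an RDD of document IDs. It assumes the travel-sample bucket and the helpers added by com.couchbase.spark._ (couchbaseQuery and couchbaseSubdocLookup); adjust the names to your deployment.

    import com.couchbase.spark._
    import com.couchbase.client.java.query.N1qlQuery

    // Run a N1QL query directly on the SparkContext.
    val airlines = sc.couchbaseQuery(
      N1qlQuery.simple("SELECT META().id AS id, name FROM `travel-sample` WHERE type = 'airline' LIMIT 5")
    )
    airlines.collect().foreach(println)

    // Fetch only selected fields of each document via a subdocument lookup
    // (requires Couchbase Server 4.5).
    val fragments = sc
      .parallelize(Seq("airline_10123"))
      .couchbaseSubdocLookup(get = Seq("name", "iata"))
    fragments.collect().foreach(println)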

Spark SQL

  • SPARKC-41: Manual override of the schemaFilter has been fixed. It is now possible to define a filter like this:
    sqlContext.read
      .option("bucket", "travel-sample")
      .option("schemaFilter", "type = 'airline'")
      .couchbase()
  • SPARKC-42: The Spark SQL count() operator now works: a bug has been fixed so that when Spark SQL does not pass in required columns, the query is transformed into a * query.
  • SPARKC-30: It is now possible to provide a manual schema together with a custom schema filter; see the sketch after this list.
  • SPARKC-43: The Java API can now use Spark SQL directly. See the Java API documentation for more information.
  • SPARKC-47: Spark SQL filter expression support has been extended to cover all Spark filters, including nested ones.
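
A minimal sketch of combining a manual schema with a schema filter is shown below. The field names and the travel-sample bucket are illustrative; couchbase() on the DataFrameReader is assumed from com.couchbase.spark.sql._.

    import org.apache.spark.sql.types._
    import com.couchbase.spark.sql._

    // The explicit schema skips inference, while the schemaFilter option still
    // restricts which documents are read from the bucket.
    val airlines = sqlContext.read
      .schema(StructType(Seq(
        StructField("META_ID", StringType),
        StructField("name", StringType),
        StructField("iata", StringType)
      )))
      .option("bucket", "travel-sample")
      .option("schemaFilter", "type = 'airline'")
      .couchbase()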

Spark Streaming

  • The internal implementation has been updated to the latest release but is still experimental. Note that FromBeginning is implemented, but FromNow still has known issues that will be fixed in later releases. Also, cluster rebalance support is not yet available and will follow in a future release.