Releases: SciRuby/daru

Improvements and bug fixes

08 Aug 09:16
eb2380d
Pre-release

Bug fixes:

Small fixes and improvements

02 Jul 18:05
  • Minor Enhancements

    • Allow passing a singular Symbol to the CSV converters option (@takkanm)
    • Support calling GroupBy#each_group without a block (@hibariya)
    • Refactor grouping and aggregation (@paisible-wanderer)
    • Add String Converter to Daru::IO::CSV::CONVERTERS (@takkanm)
    • Fix annoying missing libraries warning
    • Remove post-install message (nice yet useless)
  • Fixes

    • Fix group_by for DataFrame with single row (@baarkerlounger)
    • #rolling_fillna! bugfixes on Daru::Vector and Daru::DataFrame (@mhammiche)
    • Fix #include? on MultiIndex (@rohitner)
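
Calling an iterator such as GroupBy#each_group without a block usually follows the standard Ruby idiom of returning an Enumerator. The sketch below shows that idiom in plain Ruby; `TinyGroupBy` and its data are hypothetical stand-ins for illustration and are not daru's implementation.

```ruby
# Hypothetical stand-in for a group-by object; NOT daru's implementation.
class TinyGroupBy
  def initialize(rows, key)
    # Hash of group key => array of rows belonging to that group.
    @groups = rows.group_by { |row| row[key] }
  end

  # Yields (group_key, rows) pairs; returns an Enumerator when no block is given.
  def each_group
    return to_enum(:each_group) unless block_given?

    @groups.each { |name, members| yield name, members }
    self
  end
end

rows = [
  { city: 'Pune', sales: 10 },
  { city: 'Oslo', sales: 7 },
  { city: 'Pune', sales: 5 }
]

groups = TinyGroupBy.new(rows, :city)

# Without a block, the Enumerator can be chained with map, select, and so on.
totals = groups.each_group.map { |city, members| [city, members.sum { |r| r[:sales] }] }.to_h
```

Returning an Enumerator lets callers chain Enumerable methods instead of collecting results by hand inside a block.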

Gradual improvement on a road to 1.0

07 Nov 22:34

We are currently working hard on a proper version 1.0 release, with daru-io integration, full codebase cleanup and a lot of cool things.

In the meantime, here is 0.2.0!

  • Major Enhancements

  • Minor Enhancements

    • Allow Vector#count to be called without param for category type Vector (@rainchen)
    • Add option to DataFrame#vector_sum to skip nils (@parthm)
    • Add installation instructions to README.md (@koishimasato)
    • Add release policy documentation (@baarkerlounger)
    • Set index as DataFrame's default x axis for nyaplot (@matugm)
  • Fixes

    • Fix DataFrame/Vector#to_s when name is a symbol (@baarkerlounger)
    • Force Vector#proportions to return float (@rainchen)
    • DataFrame#new creates empty DataFrame when given empty hash (@parthm)
    • Remove unnecessary backports dependencies (@zverok)
    • Specify minimum packable dependency (@zverok)
    • Preserve key/column order when creating DataFrame from hash (@baarkerlounger)
    • Fix DataFrame#add_row for DF with multi-index (@zverok)
    • Fix Vector#min, #max, #index_of_min, #index_of_max (0.1.6 regression) (@athityakumar)
    • Integrate yard-junk into CI (@rohitner)
    • Remove Travis spec restriction (@zverok)
    • Fix tuple sorting for DataFrames with nils (@baarkerlounger)
    • Fix merge on index dropping default index (@rohitner)

Major IO upgrades, fixes and minor API changes.

09 Aug 13:48

Here's the full changelog of this release (all thanks to @baarkerlounger for the hard work!):

  • Major Enhancements

    • Add support for reading HTML tables into DataFrames (@athityakumar)
    • Add support for importing remote CSVs (@athityakumar, @anshuman23)
    • Allow named indexes (@Shekharrajak)
    • DataFrame GroupBy returns MultiIndex DataFrame (@Shekharrajak)
    • Add new functions to Vector: max, min, index_of_max, index_of_min, max_by, min_by, index_of_max_by, index_of_min_by (@athityakumar)
    • Add summary to DataFrame and Vector without reportbuilder (@ananyo2012)
    • Add support for missing data for where clause (@athityakumar)
  • Minor Enhancements

    • Allow inserting or updating DataFrame vectors with single values (@baarkerlounger)
    • Add a boolean converter to the CSV importer (@baarkerlounger)
    • Fix documentation of replace_values method (@kojix2)
    • Improve HTML table code of DataFrame and Vector (@Shekharrajak)
    • Support CSV files with empty rows (@baarkerlounger)
    • Better DataFrame and Vector to_s methods (@baarkerlounger)
    • Add support for histogram to Vector moving average convergence-divergence (@parthm)
    • Add support for negative arguments to Vector.lag (@parthm)
    • Return Nyaplot instance instead of nil for Nyaplot Vector, Category and DataFrame (@Shekharrajak)
    • Add global configurable error stream which allows error stream to be silenced (@sivagollapalli)
    • Rubocop update and cleanup (@zverok)
    • Improve performance of DataFrame covariance (@genya0407)
    • Index [] to only take index value as argument (@ananyo2012)
    • Better error raised when Vector is missing from DataFrame (@sivagollapalli)
    • Add default order for DataFrame (@athityakumar)
    • Add is_values to Index (@Shekharrajak)
    • Improve spec style in IO/SQL data source spec (@dshvimer)
    • Open SQLite databases by path (@dshvimer)
    • Remove unnecessary whitespace (@Shekharrajak)
    • Remove the .svg from Travis CI build link (@athityakumar)
    • Fix Travis CI icon in README (@athityakumar)
    • Replace is_nil?, not_nil? with is_values (@lokeshh)
    • Update contributing documentation (@v0dro)
  • Fixes

    • Fix missing axis labels for categorized scatter plot with Gruff (@xprazak2)
    • Fix NMatrix Vector initialization when Vector has nils and no nm_type is given (@baarkerlounger)
    • Fix head/tail methods on DataFrames with DateTime indexes and on Vector_at splat calls (@baarkerlounger)
    • Fix empty DateTime Index (@zverok)
    • Fix where clause when data contains missing/undefined values (@Shekharrajak)
    • Fix apply_scalar_operator spec (@athityakumar)
    • Change nil check to respond_to operator check for apply_scalar_operator (@athityakumar)
    • Make where compatible with is_values (@athityakumar)
    • Fix vector is_values method (@athityakumar)

Bug fixes, enhancements and some API changes.

31 Jan 12:24

This release introduces the following changes:

  • Major Enhancements
    • Add Daru::Vector#group_by (@lokeshh).
    • Add rspec-guard to run tests automatically (@lokeshh).
    • Remove the implicit Hash conversion method from Daru::DataFrame, since DataFrames are not implicit hashes and an implicit converter can introduce unwanted side effects. (@gnilrets)
    • Add Daru::DataFrame#union. (Tim)
  • Minor Enhancements
    • Added a join indicator. (@gnilrets)
    • Support an enumerable value as an index of a vector. (Yuichiro Kaneko)
    • Add test case for NegativeDateOffset. (Yuichiro Kaneko)
    • Add test case for #on_offset?. (Yuichiro Kaneko)
    • NegativeDateOffset#- returns DateOffset. (Yuichiro Kaneko)
    • Make Vector#resort_index private because it was only used internally by Vector#sort. (Yuichiro Kaneko)
    • Add DataFrame#order= method to reorder vectors in a dataframe. (@lokeshh)
    • Use Integer instead of Fixnum throughout the gem. (Yuichiro Kaneko)
    • Improve error message of Daru::Vector#index=. (@lokeshh)
    • Deprecate freqs and make frequencies return a Daru::Vector. (@lokeshh)
    • DataFrame#access_row with integer index. (Yusuke Sangenya)
    • Add method alias for comparison operator. (Yusuke Sangenya)
    • Update Nokogiri version. (Yusuke Sangenya)
    • Return Daru::Vector for multiple modal values for Daru::Vector#mode. (@baarkerlounger)
  • Fixes
    • Fix many to one joins. The prior version was shifting values in the left dataframe before checking whether values in the right dataframe should be shifted. They both need to be checked at the same time before shifting either. (@gnilrets)
    • Support formatting empty dataframes. They were returning an error before. (@gnilrets)
    • method_missing in Daru::DataFrame would not detect the correct vector if it was a String. Fixed that. (@lokeshh)
    • Fix docs of contrast_code to specify that the default value is false. (@v0dro)
    • Fix occurrence of SystemStackError due to faulty argument passing to Array#values_at. (@v0dro)
    • Fix DataFrame#pivot_table regression that raised an ArgumentError if the :index option was not specified. (@zverok)
    • Fix DataFrame.rows to accept empty argument. (@zverok)
    • Fix bug with false values on dataframe create. DataFrame from an Array of hashes wasn't being created properly when some of the values were false. (@gnilrets)
    • Fix Vector#reorder! method. (Yusuke Sangenya)
    • Fix DataFrame#group_by for numeric indexes. (@zverok)
    • Make DataFrame#index= accept only Daru::Index. (Yusuke Sangenya)
    • DataFrame#vectors= now changes the name of vectors contained in the internal @data variable. (Yusuke Sangenya)

Categorical data support. Performance improvements.

19 Aug 11:31

0.1.4 (19 August 2016)

  • Major Enhancements
    • Added new dependency 'backports' to support #to_h in Ruby 2.0. (@lokeshh)
    • Greatly improve code test coverage. (@zverok)
    • Greatly refactor code and make some methods faster, smaller and more readable. (@zverok)
    • Add support for categorical data with different coding schemes, along with several built-in methods for working with categorical data. Add a new index 'Daru::CategoricalIndex'. (@lokeshh)
    • Removed runtime dependencies on 'spreadsheet' and 'reportbuilder'. They are now loaded if the libraries are already present in the system. (@v0dro)
  • Minor enhancements
    • Update SqlDataSource to improve the performance of DataFrame.from_sql. (@dansbits)
    • Remove default DataFrame name. DataFrames now have no name by default. (@zverok)
    • Better looking #inspect for Vector and DataFrame. (@zverok)
    • Better looking #to_html for Vector and DataFrame. Also better #to_html for MultiIndex. (@zverok)
    • Remove monkey patching on Array and add those methods to Daru::ArrayHelper. (@zverok)
    • Add a rake task for running RSpec for every Ruby version with a single command. (@lokeshh)
    • Add rake tasks for easily setting up and testing test harness. (@lokeshh)
    • Added Daru::Vector#to_nmatrix.
    • Remove the 'metadata' feature introduced in v0.1.3. (@gnilrets)
    • Added DataFrame#to_df and Vector#to_df. (@gnilrets)
  • Fixes
    • DataFrame#clone preserves order and name. (@wlevine)
    • Vector#where preserves name. (@v0dro)
    • Fix bug in DataFrame#pivot_table that prevented anything other than Array or Symbol to be specified in the :values option. (@v0dro)
    • Daru::Index#each returns an Enumerator if block is not specified. (@v0dro)
    • Fixes bug where joins failed when nils were in join keys. (@gnilrets)
    • DataFrame#merge now preserves the vector name type when merging. (@lokeshh)
  • Deprecations
    • Remove methods DataFrame#vector and DataFrame#column. (@zverok)
    • Remove the missing_values feature of daru. The only values that are now treated as 'missing' are nil and Float::NAN. (@lokeshh)
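
The rule that only nil and Float::NAN count as missing can be expressed as a small predicate. The sketch below is plain Ruby; `missing?` is a hypothetical helper for illustration and is not part of daru's API.

```ruby
# Hypothetical helper, NOT daru's API: a value is "missing" only if it is
# nil or a Float NaN. NaN never equals itself, so Float#nan? is required.
def missing?(value)
  value.nil? || (value.is_a?(Float) && value.nan?)
end

data = [1, nil, 2.5, Float::NAN, 'x', false]

# Positions of missing values, and the values that remain once they are dropped.
missing_positions = data.each_index.select { |i| missing?(data[i]) }
present = data.reject { |v| missing?(v) }
```

Note that `false`, empty strings and other placeholder values are kept under this rule; only nil and NaN are dropped.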

Many more code quality and speed enhancements. Lots of bug fixes.

10 May 19:14

This release incorporates many new enhancements and bug fixes from numerous contributors. Some of the salient features of this release are:

  • Sorting is now MUCH faster and can sort data with nil present.
  • All statistics methods now support missing data.
  • The code now conforms to the standards laid down by Rubocop.
  • Major performance improvements in various methods like join, merge and concat.

For a complete changelog see the HISTORY.md file.

The major contributors for this release were:

Lots of bug fixes and better IO

17 Feb 08:32

This release mostly consists of bug fixes or enhancements to various methods from a range of contributors.

Among the new features in this release are:

  • A new method DataFrame.from_activerecord to load data from Ruby on Rails.
  • Better loading of SQL data and abstraction of SQL specific features to Daru::IO::SqlDataSource.
  • Latest development dependencies and a few more optional run time dependencies like bloomfilter-rb.

See the History.md file for a full changelog.

Time series manipulation and arel-like query syntax

19 Aug 17:29

This new release brings in lots of new functionality:

  • A new index DateTimeIndex for time series manipulations
  • Many new time series functions for manipulating time series based data
  • Arel-like query syntax
  • Joins and concat
  • Many new methods for various operations
  • Lots of speedups and bug fixes

Complete support for statsample, improved plotting and more functionality.

13 Jun 09:28

This release makes daru completely compatible with statsample and statsample-glm for statistical analysis of data by making daru a dependency of these gems. You can now use daru data structures in tandem with statsample for statistical analysis.

Apart from this, some salient features of this release are as follows:

  • Many new iterators - map, filter, each, recode, collect and their destructive versions.
  • Much improved wrapper over nyaplot for plotting.
  • Many new statistics functions.
  • More functions to deal with missing data.
  • Loading and writing to many file formats like CSV files, Excel spreadsheets, plain text and SQL databases.
  • Added a new wrapper over GSL::Vector for very fast computation and compact storage.
  • Several bug fixes.
  • Better documentation and extensive usage examples.