Generate streamlined analysis workfile (CSV) from SCIO-DB #46

mjherzog · 2020-11-12T02:51:37Z

A streamlined CSV workfile will be very useful for SCA planning. The columns we need are listed below by SCTK runtime option, using the current SCTK CSV output column names. If a JSON field has multiple values, we want all of the values in one cell ("flattened").

Info:
Resource
type
name
base_name
extension
size
sha1
mime_type
file_type
programming_language

Copyrights:
copyright
copyright_holder
author

Licenses:
license_expression
license__key
license__score
license__category
license__owner

email
url

Packages:
package__type
package__namespace
package__name
package__version
package__primary_language
package__description
package__release_date
package__homepage_url
package__download_url
package__size
package__sha1
package__vcs_url
package__copyright
package__license_expression
package__declared_license
package__notice_text
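As a rough illustration of the request above, the sketch below writes only a chosen subset of columns to a CSV workfile. The `WORKFILE_COLUMNS` list (abridged here) and the `write_workfile` helper are hypothetical names, not part of ScanCode Toolkit or ScanCode.io, and the input is assumed to be an iterable of dicts whose values are already flattened to strings.

```python
import csv
import io

# Hypothetical, abridged subset of the columns requested above.
WORKFILE_COLUMNS = [
    "type",
    "name",
    "extension",
    "size",
    "copyright",
    "license_expression",
    "package__name",
]


def write_workfile(rows, out, columns=WORKFILE_COLUMNS):
    """Write rows to `out` as CSV, keeping only the requested columns.

    Keys not listed in `columns` are ignored; missing keys become empty cells.
    """
    writer = csv.DictWriter(
        out, fieldnames=columns, extrasaction="ignore", restval=""
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


if __name__ == "__main__":
    buffer = io.StringIO()
    sample = [{"type": "file", "name": "a.c", "size": "120", "md5": "ignored"}]
    write_workfile(sample, buffer)
    print(buffer.getvalue())
```

Keeping the column list as plain data would make it straightforward to define other workfile variations later by passing a different `columns` argument.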

steven-esser · 2020-11-12T15:55:37Z

@JonoYang has written this utility: https:/nexB/spats/blob/develop/src/spats/scanpipe_results_to_xlsx.py that "flattens" the data the way we normally expect it (\n characters separating multiple values) and supports package data.
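For readers unfamiliar with this kind of flattening, here is a minimal sketch of the idea, assuming each scan row is a dict whose values may be scalars, `None`, or lists. The `flatten_value` and `flatten_row` helpers are hypothetical and are not taken from the utility linked above; nested mappings are simply stringified rather than handled specially.

```python
def flatten_value(value, separator="\n"):
    """Collapse a JSON-style value into a single cell string.

    Lists and tuples are joined with `separator`; None becomes an empty
    string; anything else is converted with str().
    """
    if value is None:
        return ""
    if isinstance(value, (list, tuple)):
        parts = [flatten_value(item, separator) for item in value]
        # Skip empty entries so the cell has no blank lines.
        return separator.join(part for part in parts if part)
    return str(value)


def flatten_row(row, separator="\n"):
    """Return a copy of `row` with every value flattened to a string."""
    return {key: flatten_value(val, separator) for key, val in row.items()}


if __name__ == "__main__":
    row = {"name": "a.c", "size": 120, "license__key": ["mit", "apache-2.0"]}
    print(flatten_row(row))
```

A newline separator keeps every value visible within one spreadsheet cell; a different separator can be passed if the consuming tool handles embedded newlines poorly.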

@mjherzog I believe we want to support package data as well? It appears in our normal workfile alongside the regular ScanCode detections.

mjherzog · 2020-11-12T17:54:08Z

@MaJuRG My mistake, I forgot about packages. I have updated the main text above with selected package fields.

mjherzog · 2020-11-12T17:57:34Z

The list of fields requested is my take on what is most useful for planning and simpler analysis tasks. My working assumption is that there will be other CSV workfile output variations.


tdruez · 2023-07-10T12:29:56Z

Closing as the existing XLSX output should be used instead.

mjherzog assigned tdruez Nov 12, 2020

tdruez added a commit that referenced this issue Dec 11, 2020

Add CSV support for the output management command #46

1f75a82

Signed-off-by: Thomas Druez <[email protected]>

tdruez added a commit that referenced this issue Dec 11, 2020

Add a to_xlsx output pipe #46

7e158fa

Signed-off-by: Thomas Druez <[email protected]>

tdruez added a commit that referenced this issue Dec 11, 2020

Force all xlsx output values as strings #46

48f5572

Signed-off-by: Thomas Druez <[email protected]>

tdruez added a commit that referenced this issue Dec 11, 2020

Refine the formatting of the xlsx output #46

da44359

- Exclude the following "technical" fields: "licenses", "extra_data", "declared_license"

Signed-off-by: Thomas Druez <[email protected]>

tdruez added a commit that referenced this issue Dec 14, 2020

Fix failing unit test #46

2e646c0

Signed-off-by: Thomas Druez <[email protected]>

tdruez added a commit that referenced this issue Dec 22, 2020

Add simple test for the to_xlsx output pipe #46

c394d9c

Signed-off-by: Thomas Druez <[email protected]>

tdruez added a commit that referenced this issue Dec 22, 2020

Add CHANGELOG entry #46

3331ecc

Signed-off-by: Thomas Druez <[email protected]>

tdruez added a commit that referenced this issue Dec 22, 2020

Add a to_xlsx output pipe returning XLSX compatible content #46

4f319a1

Signed-off-by: Thomas Druez <[email protected]>

tdruez closed this as completed Jul 10, 2023
