link to docs site
sheppard committed Jun 6, 2023
1 parent 3f8f2bb commit c996d18
Showing 8 changed files with 37 additions and 669 deletions.
11 changes: 6 additions & 5 deletions CONTRIBUTING.md
@@ -4,11 +4,11 @@ Thanks for contributing to IterTable! Here are some guidelines to help you get

## Questions

Feel free to use the issue tracker to ask questions! We don't currently have a separate mailing list or active chat tool.
Questions and ideas can be submitted to the [Django Data Wizard discussion board](https://github.com/wq/django-data-wizard/discussions).

## Bug Reports

Bug reports can take any form as long as there is enough information to diagnose the problem. To speed up response time, try to include the following whenever possible:
Bug reports can be submitted to either [IterTable issues](https://github.com/wq/itertable/issues) or [Django Data Wizard issues](https://github.com/wq/itertable/issues). Reports can take any form as long as there is enough information to diagnose the problem. To speed up response time, try to include the following whenever possible:
* Versions of Fiona and/or Pandas, if applicable
* Expected (or ideal) behavior
* Actual behavior
@@ -18,9 +18,10 @@ Bug reports can take any form as long as there is enough information to diagnose
Pull requests are very welcome and will be reviewed and merged as time allows. To speed up reviews, try to include the following whenever possible:
* Reference the issue that the PR fixes (e.g. [#3](https://github.com/wq/itertable/issues/3))
* Failing test case fixed by the PR
* If the PR provides new functionality, update [the documentation](https://github.com/wq/itertable/blob/master/docs/)
* If the PR provides new functionality, update [the documentation](https://github.com/wq/django-data-wizard/tree/main/docs/itertable)
* Ensure the PR passes lint and unit tests. This happens automatically, but you can also run these locally with the following commands:

```bash
./runtests.sh # run the test suite
LINT=1 ./runtests.sh # run code style checking
python -m unittest discover -s tests -t . -v # run the test suite
flake8 # run code style checking
```
156 changes: 31 additions & 125 deletions README.md
@@ -17,128 +17,34 @@ for row in load_file("example.xlsx"):
[![Tests](https://github.com/wq/itertable/actions/workflows/test.yml/badge.svg)](https://github.com/wq/itertable/actions/workflows/test.yml)
[![Python Support](https://img.shields.io/pypi/pyversions/itertable.svg)](https://pypi.python.org/pypi/itertable)

> **Note:** Prior to version 2.0, IterTable was **wq.io**, a submodule of the [wq framework]. The package has been renamed to avoid confusion with the wq framework website (<https://wq.io>).
Similarly, IterTable's `*IO` classes have been renamed to `*Iter`, as the API is not intended to match that of Python's `StringIO` or other `io` classes.

```diff
- from wq.io import CsvFileIO
- data = CsvFileIO(filename='data.csv')
+ from itertable import CsvFileIter
+ data = CsvFileIter(filename='data.csv')
```

## Getting Started

```bash
# Recommended: create virtual environment
# python3 -m venv venv
# . venv/bin/activate

python3 -m pip install itertable

# GIS support (Fiona & Shapely)
python3 -m pip install itertable[gis]

# Excel 97-2003 (.xls) support
python3 -m pip install itertable[oldexcel]
# (xlsx support is enabled by default)

# Pandas integration
python3 -m pip install itertable[pandas]
```

## Overview

IterTable provides a general purpose API for loading, iterating over, and writing tabular datasets. The goal is to avoid needing to remember the unique usage of e.g. [csv], [openpyxl], or [xml.etree] every time one needs to work with external data. Instead, IterTable abstracts these libraries into a consistent interface that works as an [iterable] of [namedtuples]. Whenever possible, the field names for a dataset are automatically determined from the source file, e.g. the column headers in an Excel spreadsheet.

```python
from itertable import ExcelFileIter
data = ExcelFileIter(filename='example.xlsx')
for row in data:
    print(row.name, row.date)
```
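
The "iterable of namedtuples" idea described above can be illustrated with the standard library alone. This is a minimal sketch, not IterTable's implementation: `iter_rows` and the sample data are hypothetical names introduced here, and the header row is simply assumed to supply the field names.

```python
import csv
import io
from collections import namedtuple


def iter_rows(text):
    """Yield each CSV record as a namedtuple keyed by the header row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # rename=True swaps invalid field names for positional ones (_0, _1, ...)
    Row = namedtuple("Row", header, rename=True)
    for values in reader:
        yield Row(*values)


# Hypothetical sample data standing in for an external file
sample = "name,date\nAlice,2023-06-06\nBob,2023-06-07\n"
rows = list(iter_rows(sample))
for row in rows:
    print(row.name, row.date)
```

Attribute access (`row.name`) is what lets the same loop body work regardless of the underlying file format.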

IterTable provides a number of built-in classes like the above, including a `CsvFileIter`, `XmlFileIter`, and `JsonFileIter`. There is also a convenience function, `load_file()`, that attempts to automatically determine which class to use for a given file.

```python
from itertable import load_file
data = load_file('example.csv')
for row in data:
    print(row.name, row.date)
```

All of the included `*FileIter` classes support both reading and writing to external files.

### Network Client

IterTable also provides network-capable equivalents of each of the above classes, to facilitate loading data from third party webservices.

```python
from itertable import JsonNetIter
class WebServiceIter(JsonNetIter):
    url = "http://example.com/api"

data = WebServiceIter(params={'type': 'all'})
for row in data:
    print(row.timestamp, row.value)
```

The powerful [requests] library is used internally to load data over HTTP.

### Pandas Analysis

When [Pandas] is installed (via `itertable[pandas]`), the `as_dataframe()` method on itertable classes can be used to create a [DataFrame], enabling more extensive analysis possibilities.

```python
instance = WebServiceIter(params={'type': 'all'})
df = instance.as_dataframe()
print(df.value.mean())
```

### GIS Support

When [Fiona] and [Shapely] are installed (via `itertable[gis]`), itertable can also open and create shapefiles and other OGR-compatible geographic data formats.

```python
from itertable import ShapeIter
data = ShapeIter(filename='sites.shp')
for id, site in data.items():
    print(id, site.geometry.wkt)
```

More information on IterTable's GIS support is available [here][gis].

### Command-Line Interface

IterTable provides a simple CLI for rendering the content of a file or Iter class. This can be useful for e.g. inspecting a file or for integrating with a shell automation workflow. The default output is CSV, but can be changed to JSON by setting `-f json`.

```bash
python3 -m itertable example.json # JSON to CSV
python3 -m itertable -f json example.csv # CSV to JSON
python3 -m itertable example.xlsx "start_row=5"
python3 -m itertable http://example.com/example.csv
python3 -m itertable itertable.CsvNetIter "url=http://example.com/example.csv"
```

### Extending IterTable

It is straightforward to [extend IterTable][custom] to support arbitrary formats. Each provided class is composed of a [BaseIter][base] class and mixin classes ([loaders], [parsers], and [mappers]) that handle the various steps of the process.
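
The general shape of a base class combined with loader, parser, and mapper mixins might look like the sketch below. Every class and method name here (`StringLoader`, `CsvParser`, `LowercaseMapper`, `SketchBase`, `load`, `parse`, `map_row`) is hypothetical and chosen for illustration; none of them are IterTable's actual classes or hooks, which are described in the linked documentation.

```python
import csv
import io


class StringLoader:
    """Loader step: obtain a file-like object (here, from an in-memory string)."""

    def load(self):
        self.file = io.StringIO(self.source)


class CsvParser:
    """Parser step: turn the loaded file into a list of dicts."""

    def parse(self):
        self.data = list(csv.DictReader(self.file))


class LowercaseMapper:
    """Mapper step: normalize the field names of each row."""

    def map_row(self, row):
        return {key.strip().lower(): value for key, value in row.items()}


class SketchBase:
    """Hypothetical stand-in for a base class that runs the steps in order."""

    def __init__(self, source):
        self.source = source
        self.load()
        self.parse()

    def __iter__(self):
        return (self.map_row(row) for row in self.data)


class CsvStringSketch(StringLoader, CsvParser, LowercaseMapper, SketchBase):
    """Composed class: each mixin supplies one step of the pipeline."""


rows = list(CsvStringSketch("Name,Date\nAlice,2023-06-06\n"))
print(rows)
```

Under this arrangement, supporting a new format means swapping a single mixin (for example, a different parser) while the other steps stay unchanged.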

[wq framework]: https://wq.io/
[csv]: https://docs.python.org/3/library/csv.html
[openpyxl]: https://openpyxl.readthedocs.io/en/stable/
[xml.etree]: https://docs.python.org/3/library/xml.etree.elementtree.html
[iterable]: https://docs.python.org/3/glossary.html#term-iterable
[namedtuples]: https://docs.python.org/3/library/collections.html#collections.namedtuple
[requests]: http://python-requests.org/
[Pandas]: http://pandas.pydata.org/
[DataFrame]: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html
[Fiona]: https://github.com/Toblerity/Fiona
[Shapely]: https://github.com/Toblerity/Shapely

[custom]: https://github.com/wq/itertable/blob/master/docs/about.md
[base]: https://github.com/wq/itertable/blob/master/docs/base.md
[loaders]: https://github.com/wq/itertable/blob/master/docs/loaders.md
[parsers]: https://github.com/wq/itertable/blob/master/docs/parsers.md
[mappers]: https://github.com/wq/itertable/blob/master/docs/mappers.md
[gis]: https://github.com/wq/itertable/blob/master/docs/gis.md
### [Documentation][docs]

[**Installation**][installation]

[**API**][api]
<br>
[CLI][cli]
&bull;
[GIS][gis]

[**Extending IterTable**][custom]
<br>
[BaseIter][base]
&bull;
[Loaders][loaders]
&bull;
[Parsers][parsers]
&bull;
[Mappers][mappers]

[docs]: https://django-data-wizard.wq.io/itertable/

[installation]: https://django-data-wizard.wq.io/itertable/#getting-started
[api]: https://django-data-wizard.wq.io/itertable/#overview
[cli]: https://django-data-wizard.wq.io/itertable/#command-line-interface
[custom]: https://django-data-wizard.wq.io/itertable/custom
[base]: https://django-data-wizard.wq.io/itertable/base
[loaders]: https://django-data-wizard.wq.io/itertable/loaders
[parsers]: https://django-data-wizard.wq.io/itertable/parsers
[mappers]: https://django-data-wizard.wq.io/itertable/mappers
[gis]: https://django-data-wizard.wq.io/itertable/gis
68 changes: 0 additions & 68 deletions docs/about.md

This file was deleted.

92 changes: 0 additions & 92 deletions docs/base.md

This file was deleted.
