describe bed coordinates as zero-based half-open intervals #1070

Open
wants to merge 1 commit into
base: master
4 changes: 2 additions & 2 deletions docs/content/general-usage.rst
@@ -32,9 +32,9 @@ The file description below is modified from: http://genome.ucsc.edu/FAQ/FAQforma
- *The start position in each BED feature is therefore interpreted to be 1 greater than the start position listed in the feature. For example, start=9, end=20 is interpreted to span bases 10 through 20,inclusive*.
- *This column is required*.

3. **end** - The one-based ending position of the feature in the chromosome.
3. **end** - The zero-based exclusive ending position of the feature in the chromosome.

- *The end position in each BED feature is one-based. See example above*.
- *The end position in each BED feature is the 0-based coordinate of the next base not in the interval*.
- *This column is required*.

4. **name** - Defines the name of the BED feature.
9 changes: 4 additions & 5 deletions docs/content/overview.rst
@@ -133,15 +133,14 @@ And 2) For tools where only one input feature file is needed, the “-i” optio
bedtools merge –i repeats.bed

-----------------------------------------------------
BED starts are zero-based and BED ends are one-based.
BED uses zero-based half-open intervals
-----------------------------------------------------
bedtools users are sometimes confused by the way the start and end of BED features are represented. Specifically, bedtools uses the UCSC Genome Browser’s internal database convention of making the start position 0-based and the end position 1-based: (http://genome.ucsc.edu/FAQ/FAQtracks#tracks1)
In other words, bedtools interprets the “start” column as being 1 basepair higher than what is represented in the file. For example, the following BED feature represents a single base on chromosome 1; namely, the 1st base::
bedtools users are sometimes confused by the way the start and end of BED features are represented. Specifically, bedtools uses the UCSC Genome Browser’s internal database convention of making the start and end position 0-based (http://genome.ucsc.edu/FAQ/FAQtracks#tracks1).
For example, the following BED feature represents a single base on chromosome 1; namely, the first base::

chr1 0 1 first_base

Why, you might ask? The advantage of storing features this way is that when computing the length of a feature, one must simply subtract the start from the end. Were the start position 1-based,
the calculation would be (slightly) more complex (i.e. (end-start)+1). Thus, storing BED features this way reduces the computational burden.
Note that the end position is exclusive and the range represents only a single base. This is equivalent to the mathematical notion of using half-open intervals [0,1).

-----------------------------------------------------
GFF starts and ends are one-based.
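
For readers of this diff, the zero-based half-open convention described in the proposed text can be sketched in a few lines of Python. This is only an illustration and not bedtools code: the `BedInterval` type and the helper functions are hypothetical names, and the parser assumes a whitespace-delimited line whose first three columns are chrom, start, and end.

```python
from typing import NamedTuple, Tuple


class BedInterval(NamedTuple):
    """Hypothetical container for the first three BED columns."""
    chrom: str
    start: int  # zero-based, inclusive
    end: int    # zero-based, exclusive


def parse_bed_line(line: str) -> BedInterval:
    """Parse chrom/start/end from a BED line (assumes whitespace-delimited columns)."""
    fields = line.split()
    return BedInterval(fields[0], int(fields[1]), int(fields[2]))


def length(iv: BedInterval) -> int:
    """With half-open intervals, the length is simply end - start."""
    return iv.end - iv.start


def to_one_based_inclusive(iv: BedInterval) -> Tuple[int, int]:
    """Convert to 1-based inclusive coordinates (the convention GFF uses)."""
    return (iv.start + 1, iv.end)


first_base = parse_bed_line("chr1\t0\t1\tfirst_base")
print(length(first_base))                  # 1
print(to_one_based_inclusive(first_base))  # (1, 1)
```

Under the same assumptions, the `start=9, end=20` example from general-usage.rst converts to bases 10 through 20 inclusive, a length of 11.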