types as C-structs [rebased] #10579

Merged
merged 2 commits into from
Mar 20, 2015
2 changes: 1 addition & 1 deletion base/pointer.jl
@@ -51,7 +51,7 @@ unsafe_store!{T}(p::Ptr{T}, x) = pointerset(p, convert(T,x), 1)
# convert a raw Ptr to an object reference, and vice-versa
unsafe_pointer_to_objref(x::Ptr) = ccall(:jl_value_ptr, Any, (Ptr{Void},), x)
pointer_from_objref(x::ANY) = ccall(:jl_value_ptr, Ptr{Void}, (Any,), x)
-data_pointer_from_objref(x::ANY) = pointer_from_objref(x)::Ptr{Void}+Core.sizeof(Int)
+data_pointer_from_objref(x::ANY) = pointer_from_objref(x)::Ptr{Void}

eltype{T}(::Type{Ptr{T}}) = T

186 changes: 136 additions & 50 deletions doc/devdocs/object.rst
@@ -5,84 +5,170 @@ Memory layout of Julia Objects
Object layout (jl_value_t)
--------------------------

.. sidebar:: `special case. <https:/JuliaLang/julia/blob/master/src/jltypes.c#L2897>`_
The :code:`jl_value_t` struct is the name for a block of memory owned by the Julia Garbage Collector,
representing the data associated with a Julia object in memory.
Absent any type information, it is simply an opaque pointer::

:code:`jl_tuple_type->type = jl_tuple_type`
typedef struct jl_value_t* jl_pvalue_t;

The :code:`jl_value_t` struct defines the minimal header for a Julia
object in memory.
The :code:`type` field points to a
`jl_datatype_t <http:/JuliaLang/julia/blob/master/src/julia.h#L204>`_ object,
(the jl_typeof() macro should be used to query it)::
Each :code:`jl_value_t` struct is contained in a :code:`jl_typetag_t` struct that contains metadata
about the Julia object, such as its type and garbage-collector (gc) reachability::

typedef struct _jl_value_t {
struct _jl_value_t *type;
} jl_value_t;
typedef struct {
opaque metadata;
jl_value_t value;
} jl_typetag_t;

#define jl_typeof(v) (((jl_value_t*)(v))->type)
The type of any Julia object is an instance of a leaf :func:`jl_datatype_t` object.
The :func:`jl_typeof` function can be used to query for it::

jl_value_t *jl_typeof(jl_value_t *v);

The layout of the rest of the object is dependent on its type.
The layout of the object is dependent on its type.
Reflection methods can be used to inspect that layout.
A field can be accessed by calling one of the get-field methods::

e.g. a :func:`Base.tuple` object has an array of pointers to the
objects contained by the tuple::
jl_value_t *jl_get_nth_field_checked(jl_value_t *v, size_t i);
jl_value_t *jl_get_field(jl_value_t *o, char *fld);

typedef struct {
struct _jl_value_t *type;
size_t length;
jl_value_t *data[];
} jl_tuple_t;
If the field types are known, a priori, to be all pointers,
the values can also be extracted directly as an array access::

jl_value_t *v = value->fieldptr[n];

e.g. a "boxed" uint16_t (created by :func:`jl_box_uint16`) is stored as
follows (assuming machine is 64-bit)::
As an example, a "boxed" uint16_t is stored as follows::

struct {
struct _jl_value_t *type; -- 8 bytes
uint16_t data; -- 2 bytes
-- 6 bytes padding
opaque metadata;
struct {
uint16_t data; -- 2 bytes
} jl_value_t;
};

Structs for the built-in types are `defined in julia.h <http:/JuliaLang/julia/blob/master/src/julia.h#L69>`_. The corresponding global jl_datatype_t objects are created by `jl_init_types() <http:/JuliaLang/julia/blob/master/src/jltypes.c#L2887>`_.
This object is created by :func:`jl_box_uint16`.
Note that the ``jl_value_t*`` pointer references the data portion,
not the metadata at the top of the struct.
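
The relationship between the header and the data pointer can be sketched in standalone C. This is an illustrative model only, not the code from julia.h: `toy_typetag_t`, `toy_alloc`, `toy_typeof`, `toy_box_uint16`, and `toy_free` are hypothetical names, and a single pointer stands in for the opaque metadata, whose real layout is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header; a single pointer stands in for the opaque metadata. */
typedef struct {
    const void *type;
} toy_typetag_t;

/* Allocate header plus payload and return a pointer to the payload. */
void *toy_alloc(const void *type, size_t nbytes)
{
    toy_typetag_t *tag = malloc(sizeof(toy_typetag_t) + nbytes);
    if (tag == NULL)
        return NULL;
    tag->type = type;
    return tag + 1;   /* first byte after the header */
}

/* Recover the type by stepping back from the value pointer to the header. */
const void *toy_typeof(const void *v)
{
    return ((const toy_typetag_t *)v - 1)->type;
}

/* Box a uint16_t: the caller only ever sees the data pointer. */
uint16_t *toy_box_uint16(const void *type, uint16_t x)
{
    uint16_t *box = toy_alloc(type, sizeof(uint16_t));
    if (box != NULL)
        *box = x;
    return box;
}

/* Release a value obtained from toy_alloc by freeing from the header. */
void toy_free(void *v)
{
    free((toy_typetag_t *)v - 1);
}
```

In this model the pointer handed to the caller can be dereferenced as plain data, while the type is found at a fixed negative offset from it.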

A value may be stored "unboxed" in many circumstances
(just the data, without the metadata, and possibly not even stored but just kept in registers),
so it is unsafe to assume that the address of a box is a unique identifier.
The "egal" test (corresponding to the `is` function in Julia)
should instead be used to compare two unknown objects for equivalence::

Garbage collector mark bit
--------------------------
int jl_egal(jl_value_t *a, jl_value_t *b);

This optimization should be relatively transparent to the API,
since the object will be "boxed" on-demand, whenever a :code:`jl_value_t*` is needed.
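
To see why address comparison is not enough, consider a minimal sketch for immutable values only. The `toy_box_t` struct and `toy_egal` function are hypothetical and much simpler than `jl_egal`: two boxes are treated as equivalent when their type identifier and payload match, regardless of where they live in memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical box for an immutable 16-bit value. */
typedef struct {
    int      type_id;   /* hypothetical type identifier */
    uint16_t data;      /* immutable payload */
} toy_box_t;

/* Equivalence by type and contents, not by address. */
int toy_egal(const toy_box_t *a, const toy_box_t *b)
{
    if (a == b)
        return 1;   /* the same box is trivially equivalent to itself */
    return a->type_id == b->type_id && a->data == b->data;
}
```

Boxing the same value twice yields two distinct addresses, yet `toy_egal` reports them as equivalent, which is the behavior the text asks callers to rely on.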

Note that modification of a jl_value_t* in memory is permitted only if the object is mutable.
Otherwise, modification of the value may corrupt the program and the result will be undefined.
The mutability property of a value can be queried with::

int jl_is_mutable(jl_value_t *v);

If the object being stored is a :code:`jl_value_t*`, the Julia garbage-collector must also be notified::

The garbage collector uses the low bit of the :code:`jl_value_t.type`
pointer as a flag to mark reachable objects (see :code:`gcval_t`).
During each mark/sweep cycle, the gc sets the mark bit of each
reachable object, deallocates objects that are not marked, then
clears the mark bits. While the mark/sweep is in progress the
:code:`jl_value_t.type` pointer is altered by the mark bit. The gc
uses the :func:`gc_typeof` macro to retrieve the original type
pointer::
void gc_wb(jl_value_t *parent, jl_value_t *ptr);
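
The text does not say what the collector does with this notification. One common reason for such a call, assuming a generational collector, is to record old objects that have started pointing at young ones. The sketch below models only that idea: `toy_obj_t`, `toy_wb`, and `toy_store` are hypothetical and are not the implementation of `gc_wb`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heap object with one pointer field. */
typedef struct toy_obj {
    int old;                /* 1 once the object has survived a collection */
    int remembered;         /* 1 if the collector must rescan this object */
    struct toy_obj *field;  /* a single pointer field */
} toy_obj_t;

/* Toy write barrier: flag an old parent that now references a young child. */
void toy_wb(toy_obj_t *parent, const toy_obj_t *child)
{
    if (parent->old && child != NULL && !child->old)
        parent->remembered = 1;
}

/* Store a pointer into the parent, then notify the (toy) collector. */
void toy_store(toy_obj_t *parent, toy_obj_t *child)
{
    parent->field = child;
    toy_wb(parent, child);
}
```

Pairing every pointer store with the barrier call, as `toy_store` does, keeps the two steps from drifting apart.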

#define gc_typeof(v) ((jl_value_t*)(((uptrint_t)jl_typeof(v))&~1UL))
The embedding section of the manual is also required reading at this point:
it covers other details of boxing and unboxing various types
and explains the gc interactions.

Mirror structs for some of the built-in types are `defined in julia.h <http:/JuliaLang/julia/blob/master/src/julia.h>`_.
The corresponding global ``jl_datatype_t`` objects are created by `jl_init_types in jltypes.c <http:/JuliaLang/julia/blob/master/src/jltypes.c>`_.

Garbage collector mark bits
---------------------------

The garbage collector uses several bits from the metadata portion of the :code:`jl_typetag_t`
to track each object in the system.
Further details about this algorithm can be found in the comments of the `garbage-collector implementation in gc.c
<http:/JuliaLang/julia/blob/master/src/gc.c>`_.

Object allocation
-----------------

Storage for new objects is allocated by :func:`newobj` in julia_internal.h::
Most new objects are allocated by :func:`jl_new_structv`::

STATIC_INLINE jl_value_t *newobj(jl_value_t *type, size_t nfields)
{
jl_value_t *jv = (jl_value_t*)allocobj((1+nfields) * sizeof(void*));
jv->type = type;
return jv;
}
jl_value_t *jl_new_struct(jl_datatype_t *type, ...);
jl_value_t *jl_new_structv(jl_datatype_t *type, jl_value_t **args, uint32_t na);

.. sidebar:: :ref:`man-singleton-types`
However, `isbits` objects can also be constructed directly from memory::

jl_value_t *jl_new_bits(jl_value_t *bt, void *data);

And some objects have special constructors that must be used instead of the above functions:

Types::

jl_datatype_t *jl_apply_type(jl_datatype_t *tc, jl_tuple_t *params);
jl_datatype_t *jl_apply_array_type(jl_datatype_t *type, size_t dim);
jl_uniontype_t *jl_new_uniontype(jl_tuple_t *types);

While these are the most commonly used options, there are more low-level constructors too,
which you can find declared in `julia.h <http:/JuliaLang/julia/blob/master/src/julia.h>`_.
These are used in :func:`jl_init_types` to create the initial types needed to bootstrap the creation of the Julia system image.

Tuples::

jl_tuple_t *jl_tuple(size_t n, ...);
jl_tuple_t *jl_tuplev(size_t n, jl_value_t **v);
jl_tuple_t *jl_alloc_tuple(size_t n);

Tuples are represented differently from other Julia objects.
In some cases, a :func:`Base.tuple` object may be an array of pointers to the
objects contained by the tuple, equivalent to::

typedef struct {
size_t length;
jl_value_t *data[length];
} jl_tuple_t;

However, in other cases, the tuple may be converted to an anonymous :func:`isbits` type
and stored unboxed, or it may not be stored at all (if it is not being used in a generic context as a :code:`jl_value_t*`).
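
The pointer-array form can be written in valid C with a C99 flexible array member. The following is a minimal sketch under that assumption: `toy_tuple_t`, `toy_alloc_tuple`, and `toy_tupleref` are hypothetical names, elements are plain `void*`, and no type tag or gc interaction is modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tuple: a length followed by that many element pointers. */
typedef struct {
    size_t length;
    void  *data[];   /* C99 flexible array member */
} toy_tuple_t;

/* Allocate a tuple of n elements, all initialized to NULL. */
toy_tuple_t *toy_alloc_tuple(size_t n)
{
    toy_tuple_t *t = malloc(sizeof(toy_tuple_t) + n * sizeof(void *));
    if (t == NULL)
        return NULL;
    t->length = n;
    for (size_t i = 0; i < n; i++)
        t->data[i] = NULL;
    return t;
}

/* Bounds-checked element access. */
void *toy_tupleref(const toy_tuple_t *t, size_t i)
{
    assert(i < t->length);
    return t->data[i];
}
```

Because every element is a pointer, access is a plain array index, matching the direct field access described earlier for objects whose fields are all pointers.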

Symbols::

jl_sym_t *jl_symbol(const char *str);

Functions and LambdaStaticData::

jl_function_t *jl_new_generic_function(jl_sym_t *name);
jl_lambda_info_t *jl_new_lambda_info(jl_value_t *ast, jl_tuple_t *sparams);
jl_function_t *jl_new_closure(jl_fptr_t proc, jl_value_t *env, jl_lambda_info_t *li);

Arrays::

jl_array_t *jl_new_array(jl_value_t *atype, jl_tuple_t *dims);
jl_array_t *jl_new_arrayv(jl_value_t *atype, ...);
jl_array_t *jl_alloc_array_1d(jl_value_t *atype, size_t nr);
jl_array_t *jl_alloc_array_2d(jl_value_t *atype, size_t nr, size_t nc);
jl_array_t *jl_alloc_array_3d(jl_value_t *atype, size_t nr, size_t nc, size_t z);
jl_array_t *jl_alloc_cell_1d(size_t n);

Note that many of these have alternative allocation functions for various special purposes.
The list here reflects the more common usages, but a more complete list can be found by reading the `julia.h header file
<http:/JuliaLang/julia/blob/master/src/julia.h>`_.

Internal to Julia, storage is typically allocated by :func:`newstruct` (or :func:`newobj` for the special types)::

jl_value_t *newstruct(jl_value_t *type);
jl_value_t *newobj(jl_value_t *type, size_t nfields);

At the lowest level, memory is allocated by a call to the garbage collector (in gc.c),
then tagged with its type::

jl_value_t *allocobj(size_t nbytes);
void jl_set_typeof(jl_value_t *v, jl_datatype_t *type);

.. sidebar:: :ref:`man-singleton-types`

Singleton types have only one instance and no data fields.
Singleton instances use only 8 bytes.
Singleton instances have a size of 0 bytes,
and consist only of their metadata.
e.g. :data:`nothing::Void`.

See :ref:`man-singleton-types` and :ref:`man-nothing`

Note that all objects are allocated in multiples of 8 bytes, so the
smallest object size is 16 bytes (8 byte type pointer + 8 bytes
data). :func:`allocobj` in gc.c allocates memory for new objects.
Memory is allocated from a pool for objects up to 2048 bytes, or
by malloc() otherwise.
Note that all objects are allocated in multiples of 4 bytes and aligned to the platform pointer size.
Memory is allocated from a pool for smaller objects, or directly with malloc() for large objects.
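
The size rule can be illustrated with a small sketch. The helper names `toy_round_up` and `toy_uses_pool` are hypothetical, and the 2048-byte cutoff is taken from the older wording of this paragraph; it is an assumption here and may not match the current allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cutoff between pool and malloc allocation (from the older text). */
#define TOY_POOL_MAX 2048

/* Round nbytes up to the next multiple of `multiple` (which must be > 0). */
size_t toy_round_up(size_t nbytes, size_t multiple)
{
    assert(multiple > 0);
    return (nbytes + multiple - 1) / multiple * multiple;
}

/* Decide whether a request of nbytes would be served from a pool. */
int toy_uses_pool(size_t nbytes)
{
    return nbytes <= TOY_POOL_MAX;
}
```

For example, a 5-byte request rounds up to 8 bytes under the multiple-of-4 rule, and would come from a pool under the assumed cutoff.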