geonature.core.imports.utils

Attributes

Classes

ImportStep

Enum where members are also (and must be) ints

Functions

clean_import(→ None)

Clean an import at a specific step.

get_file_size(→ int)

Get the size of a file in bytes.

detect_encoding(→ str)

Detects the encoding of a file.

detect_separator(→ Optional[str])

Detects the delimiter used in a CSV file.

preprocess_value(→ pandas.Series)

Preprocesses values in a DataFrame depending on whether the field contains multiple values (e.g. additional_data).

insert_import_data_in_transient_table(→ int)

Insert the data from the import file into the transient table.

build_fieldmapping(→ Tuple[Dict[str, Dict[str, Any]], ...)

Build a dictionary that maps the source column names to the corresponding field and values.

load_transient_data_in_dataframe(imprt, entity, ...[, ...])

Load data from the transient table into a pandas dataframe.

update_transient_data_from_dataframe(imprt, entity, ...)

Update the transient table with the data from the dataframe.

generate_pdf_from_template(→ bytes)

Generate a PDF document from a template.

get_mapping_data(import_, entity)

Get the mapping data for a given import and entity.

get_required(import_, entity)

compute_bounding_box(imprt, geom_entity_code, ...[, ...])

Compute the bounding box of an entity with a geometry in the given import, based on its entities tree (e.g. Station -> Habitat; Site -> Visite -> Observation).

Module Contents

class geonature.core.imports.utils.ImportStep

Bases: enum.IntEnum

Enum where members are also (and must be) ints

UPLOAD = 1
DECODE = 2
LOAD = 3
PREPARE = 4
IMPORT = 5
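
Because ImportStep derives from enum.IntEnum, its members compare like plain integers, so steps can be ordered. The sketch below redefines the enum locally with the values listed above so that it runs without GeoNature installed; the helper steps_from is hypothetical and is not part of this module.

```python
from enum import IntEnum
from typing import List


class ImportStep(IntEnum):
    # Local mirror of the documented values (not imported from GeoNature).
    UPLOAD = 1
    DECODE = 2
    LOAD = 3
    PREPARE = 4
    IMPORT = 5


def steps_from(step: ImportStep) -> List[ImportStep]:
    """Hypothetical helper: return the given step and every later one, in order."""
    return [s for s in ImportStep if s >= step]


print(steps_from(ImportStep.LOAD))
```

Integer comparison is what makes the `s >= step` filter work; a plain enum.Enum would raise TypeError on that comparison.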
geonature.core.imports.utils.generated_fields
geonature.core.imports.utils.clean_import(imprt: geonature.core.imports.models.TImports, step: ImportStep) → None

Clean an import at a specific step.

Parameters

imprt : TImports

The import to clean.

step : ImportStep

The step at which to clean the import.

geonature.core.imports.utils.get_file_size(file_: IO) → int

Get the size of a file in bytes.

Parameters

file_ : IO

The file to get the size of.

Returns

int

The size of the file in bytes.
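
One common way to measure a seekable file object is to seek to its end and read the position. The sketch below illustrates that technique only; it is an assumption about how such a function could work, not the GeoNature implementation, and file_size_in_bytes is a hypothetical name.

```python
import io
import os
from typing import IO


def file_size_in_bytes(file_: IO) -> int:
    """Measure a seekable binary file object, leaving its position unchanged."""
    current = file_.tell()       # remember where the caller was
    file_.seek(0, os.SEEK_END)   # jump to the end of the stream
    size = file_.tell()          # the end offset is the size in bytes
    file_.seek(current)          # restore the original position
    return size


buffer = io.BytesIO(b"col_a;col_b\n1;2\n")
print(file_size_in_bytes(buffer))
```

Restoring the position matters when the same file object is read again afterwards, for example to detect its encoding.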

geonature.core.imports.utils.detect_encoding(file_: IO) → str

Detects the encoding of a file.

Parameters

file_ : IO

The file to detect the encoding of.

Returns

str

The detected encoding. If no encoding is detected, "UTF-8" is returned.
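
The "detect, otherwise fall back to UTF-8" contract can be illustrated with a much simpler technique than a full detector: checking for a byte-order mark. This page does not say how detect_encoding works internally, so the sketch below is a stand-in under stated assumptions: guess_encoding is a hypothetical name, the file is assumed to be opened in binary mode, and only BOM-prefixed files are recognised.

```python
import codecs
import io
from typing import IO

# Byte-order marks and the codec each one signals. The UTF-32 entries come
# first because the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32"),
    (codecs.BOM_UTF32_BE, "UTF-32"),
    (codecs.BOM_UTF8, "UTF-8-SIG"),
    (codecs.BOM_UTF16_LE, "UTF-16"),
    (codecs.BOM_UTF16_BE, "UTF-16"),
]


def guess_encoding(file_: IO, default: str = "UTF-8") -> str:
    """Return the codec signalled by a BOM, or the default when none is found."""
    position = file_.tell()
    head = file_.read(4)   # the longest BOM is four bytes
    file_.seek(position)   # leave the stream where it was
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default


print(guess_encoding(io.BytesIO(b"cd_nom;date_min\n")))
```

The fallback branch mirrors the documented behaviour: when nothing is detected, the caller still gets a usable encoding name.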

geonature.core.imports.utils.detect_separator(file_: IO, encoding: str) → str | None

Detects the delimiter used in a CSV file.

Parameters

file_ : IO

The file object whose delimiter is detected.

encoding : str

The encoding of the file.

Returns

Optional[str]

The delimiter used in the file, or None if no delimiter is detected.

Raises

BadRequest

If the file does not start with column names.
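
Delimiter detection of this kind can be sketched with the standard library's csv.Sniffer. This is an illustration under assumptions, not the function's actual code: sniff_delimiter and its candidates parameter are hypothetical, the file is assumed to be opened in binary mode, and the BadRequest check on the header row is left out.

```python
import csv
import io
from typing import IO, Optional


def sniff_delimiter(file_: IO, encoding: str, candidates: str = ";,\t|") -> Optional[str]:
    """Guess the delimiter from the first few KiB, or return None if unclear."""
    position = file_.tell()
    # errors="replace" tolerates a multi-byte character cut at the sample edge.
    sample = file_.read(4096).decode(encoding, errors="replace")
    file_.seek(position)
    try:
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # csv.Sniffer raises csv.Error when no candidate fits the sample.
        return None


print(repr(sniff_delimiter(io.BytesIO(b"a;b;c\n1;2;3\n4;5;6\n"), "utf-8")))
```

Restricting the candidates keeps the sniffer from picking a letter or digit that happens to occur the same number of times on every line.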

geonature.core.imports.utils.preprocess_value(dataframe: pandas.DataFrame, field: geonature.core.imports.models.BibFields, source_col: str) → pandas.Series

Preprocesses values in a DataFrame depending on whether the field contains multiple values (e.g. additional_data).

Parameters

dataframe : pd.DataFrame

The DataFrame containing the values to preprocess.

field : BibFields

The field to preprocess.

source_col : str

The column to preprocess.

Returns

pd.Series

The preprocessed value.

geonature.core.imports.utils.insert_import_data_in_transient_table(imprt: geonature.core.imports.models.TImports) → int

Insert the data from the import file into the transient table.

Parameters

imprt : TImports

The current import.

Returns

int

The last line number of the import file that was inserted.

geonature.core.imports.utils.build_fieldmapping(imprt: geonature.core.imports.models.TImports, columns: Iterable[Any]) → Tuple[Dict[str, Dict[str, Any]], List[str]]

Build a dictionary that maps the source column names to the corresponding field and values.

Parameters

imprt : TImports

The import to check.

columns : Iterable[Any]

The columns to map.

Returns

tuple

A tuple containing a dictionary that maps the source column names to the corresponding field and values, and a list of the used columns.

geonature.core.imports.utils.load_transient_data_in_dataframe(imprt: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity, source_cols: list, offset: int = None, limit: int = None)

Load data from the transient table into a pandas dataframe.

Parameters

imprt : TImports

The import to load.

entity : Entity

The entity to load.

source_cols : list

The columns to load from the transient table.

offset : int, optional

The number of rows to skip.

limit : int, optional

The maximum number of rows to load.

Returns

pandas.DataFrame

The dataframe containing the loaded data.
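
The offset and limit parameters make it possible to read the transient table in slices rather than all at once. As a sketch of how a caller might produce those slices, the hypothetical helper below (iter_batches is not part of this module) yields (offset, limit) pairs that cover a known number of rows; it assumes the rows are returned in a stable order between calls.

```python
from typing import Iterator, Tuple


def iter_batches(total_rows: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield (offset, limit) pairs that cover total_rows rows without overlap."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for offset in range(0, total_rows, batch_size):
        # The last batch may be shorter than batch_size.
        yield offset, min(batch_size, total_rows - offset)


for offset, limit in iter_batches(total_rows=25, batch_size=10):
    print(offset, limit)
```

Each pair would then be passed as the offset and limit arguments of one call.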

geonature.core.imports.utils.update_transient_data_from_dataframe(imprt: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity, updated_cols: Set[str], dataframe: pandas.DataFrame)

Update the transient table with the data from the dataframe.

Parameters

imprt : TImports

The import to update.

entity : Entity

The entity to update.

updated_cols : Set[str]

The columns to update.

dataframe : pandas.DataFrame

The dataframe to use for the update.

Notes

The dataframe must have the columns “id_import” and “line_no”.

geonature.core.imports.utils.generate_pdf_from_template(template: str, data: Any) → bytes

Generate a PDF document from a template.

Parameters

template : str

The name of the template file to use.

data : Any

The data to pass to the template.

Returns

bytes

The PDF document as bytes.

geonature.core.imports.utils.get_mapping_data(import_: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity)

Get the mapping data for a given import and entity.

Parameters

import_ : TImports

The import to get the mapping data for.

entity : Entity

The entity to get the mapping data for.

Returns

fields : dict

A dictionary of all the fields associated with the entity (see gn_imports.bib_fields), keyed by field name, with the corresponding BibFields object as value.

selected_fields : dict

Same format as fields, but restricted to the fields contained in the mapping.

source_cols : list

The columns to load into the dataframe, mainly the source columns of non-nomenclature fields.

geonature.core.imports.utils.get_required(import_: geonature.core.imports.models.TImports, entity: geonature.core.imports.models.Entity)
geonature.core.imports.utils.compute_bounding_box(imprt: geonature.core.imports.models.TImports, geom_entity_code: str, geom_4326_field_name: str, *, child_entity_code: str = None, transient_where_clause=None, destination_where_clause=None)

Compute the bounding box of an entity with a geometry in the given import, based on its entities tree (e.g. Station -> Habitat; Site -> Visite -> Observation).

Parameters

imprt : TImports

The import to get the bounding box of.

geom_entity_code : str

The code of the entity that contains the geometry.

geom_4326_field_name : str

The name of the column in the geom entity table that contains the geometry.

child_entity_code : str, optional

The code of the last child entity (of the geom entity) to consider when computing the bounding box. If not given, the bounding box is computed on the geom entity only.

transient_where_clause : sqlalchemy.sql.elements.BooleanClauseList, optional

A where clause to apply to the query when computing the bounding box of a processed import.

destination_where_clause : sqlalchemy.sql.elements.BooleanClauseList, optional

A where clause to apply to the query when computing the bounding box of a finished import.

Returns

valid_bbox : dict

The bounding box of all entities in the given import, in GeoJSON format.
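
To show what a bounding box "in GeoJSON format" can look like, the sketch below builds one from a handful of longitude/latitude pairs in plain Python. It is an illustration only: this page does not say how compute_bounding_box builds its result, the choice of a Polygon geometry is an assumption, and bbox_as_geojson is a hypothetical name.

```python
import json
from typing import Dict, Iterable, Tuple


def bbox_as_geojson(points: Iterable[Tuple[float, float]]) -> Dict:
    """Return the axis-aligned bounding box of (lon, lat) points as a GeoJSON Polygon.

    Requires at least one point.
    """
    lons, lats = zip(*points)
    west, east = min(lons), max(lons)
    south, north = min(lats), max(lats)
    # A GeoJSON linear ring is closed (first position == last position) and
    # exterior rings run counterclockwise.
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {"type": "Polygon", "coordinates": [ring]}


box = bbox_as_geojson([(2.0, 48.0), (3.5, 47.0), (1.0, 49.5)])
print(json.dumps(box))
```

Positions are written longitude first, which is the axis order GeoJSON uses for WGS 84 coordinates.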