Commit 8a7808b7, authored 7 years ago by Enric Tejedor Saavedra; committed by Danilo Piparo 7 years ago
Add some Doxygen documentation for CSV DS
parent eff91a29
tree/treeplayer/src/TCsvDS.cxx: 87 additions, 0 deletions
// Author: Enric Tejedor CERN 10/2017
/*************************************************************************
* Copyright (C) 1995-2017, Rene Brun and Fons Rademakers. *
* All rights reserved. *
* *
* For the licensing terms see $ROOTSYS/LICENSE. *
* For the list of contributors see $ROOTSYS/README/CREDITS. *
*************************************************************************/
// clang-format off
/** \class ROOT::Experimental::TDF::TCsvDS
\ingroup dataframe
\brief TDataFrame data source class for reading CSV files.
The TCsvDS class implements a CSV file reader for TDataFrame.
A TDataFrame that reads from a CSV file can be constructed using the factory method
ROOT::Experimental::TDF::MakeCsvDataFrame, which accepts three parameters:
1. Path to the CSV file.
2. Boolean that specifies whether the first row of the CSV file contains headers or
not (optional, default `true`). If `false`, header names will be automatically generated.
3. Delimiter (optional, default ',').
The types of the columns in the CSV file are automatically inferred. The supported
types are:
- Integer: stored as a 64-bit long long int.
- Floating point number: stored with double precision.
- Boolean: matches the literals `true` and `false`.
- String: stored as an std::string, matches anything that does not fall into any of the
previous types.
These are some formatting rules expected by the TCsvDS implementation:
- All records must have the same number of fields, in the same order.
- Any field may be quoted.
~~~
"1997","Ford","E350"
~~~
- Fields with embedded delimiters (e.g. comma) must be quoted.
~~~
1997,Ford,E350,"Super, luxurious truck"
~~~
- Fields with double-quote characters must be quoted, and each of the embedded
double-quote characters must be represented by a pair of double-quote characters.
~~~
1997,Ford,E350,"Super, ""luxurious"" truck"
~~~
- Fields with embedded line breaks are not supported, even when quoted.
~~~
1997,Ford,E350,"Go get one now
they are going fast"
~~~
- Spaces are considered part of a field and are not ignored.
~~~
1997, Ford , E350
not same as
1997,Ford,E350
but same as
1997, "Ford" , E350
~~~
- If a header row is provided, it must contain column names for each of the fields.
~~~
Year,Make,Model
1997,Ford,E350
2000,Mercury,Cougar
~~~
The current implementation of TCsvDS reads the entire CSV file content into memory before
TDataFrame starts processing it. Therefore, before creating a CSV TDataFrame, it is
important to check both how much memory is available and the size of the CSV file.
*/
// clang-format on
#include <ROOT/RMakeUnique.hxx>
#include <ROOT/TCsvDS.hxx>
#include <ROOT/TDFUtils.hxx>
...
...
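Note on the quoting rules documented above: they can be illustrated with a small standalone splitter. This is only a sketch under stated assumptions, not the code of `TCsvDS::ParseValue`; the function name `SplitCsvLine` is hypothetical and does not exist in ROOT. It keeps spaces as part of the field, allows delimiters inside quoted fields, and treats a doubled double-quote inside a quoted field as one literal quote. Embedded line breaks are not handled, in line with the documented limitation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of TCsvDS): split one CSV record into fields
// following the formatting rules listed in the class documentation.
std::vector<std::string> SplitCsvLine(const std::string &line, char delimiter = ',')
{
   std::vector<std::string> fields;
   std::string current;
   bool inQuotes = false;

   for (std::size_t i = 0; i < line.size(); ++i) {
      const char c = line[i];
      if (inQuotes) {
         if (c == '"') {
            if (i + 1 < line.size() && line[i + 1] == '"') {
               current += '"'; // a pair of quotes stands for one literal quote
               ++i;
            } else {
               inQuotes = false; // closing quote
            }
         } else {
            current += c; // delimiters inside quotes belong to the field
         }
      } else if (c == '"') {
         inQuotes = true; // opening quote, not stored in the field
      } else if (c == delimiter) {
         fields.push_back(current); // end of the current field
         current.clear();
      } else {
         current += c; // spaces are kept, never trimmed
      }
   }
   fields.push_back(current); // last field of the record
   return fields;
}
```

With this sketch, `1997, Ford , E350` and `1997, "Ford" , E350` both yield the field ` Ford ` (with surrounding spaces), which differs from the `Ford` obtained from `1997,Ford,E350`.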
@@ -135,6 +208,12 @@ size_t TCsvDS::ParseValue(const std::string &line, std::vector<std::string> &col
   return i;
}
////////////////////////////////////////////////////////////////////////
/// Constructor to create a CSV TDataSource for TDataFrame.
/// \param[in] fileName Path of the CSV file.
/// \param[in] readHeaders `true` if the CSV file contains headers as first row, `false` otherwise
/// (default `true`).
/// \param[in] delimiter Delimiter character (default ',').
TCsvDS::TCsvDS(std::string_view fileName, bool readHeaders, char delimiter) // TODO: Let users specify types?
   : fFileName(fileName), fDelimiter(delimiter)
...
...
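Note on the constructor hunk above: the class documentation says column types are inferred automatically as integer, floating point, boolean or string. One possible way to do this is sketched below. It is an assumption for illustration only: the enum `ColType`, the function `InferType` and the regular expressions are hypothetical and are not taken from the TCsvDS sources, which may match fields differently.

```cpp
#include <regex>
#include <string>

// Hypothetical type tags mirroring the four documented categories.
enum class ColType { Integer, Double, Boolean, String };

// Hypothetical helper (not part of TCsvDS): classify a single field.
// The checks run from most to least specific; String is the fallback.
ColType InferType(const std::string &field)
{
   static const std::regex intRegex("[-+]?[0-9]+");
   static const std::regex doubleRegex("[-+]?([0-9]+\\.[0-9]*|\\.[0-9]+|[0-9]+)([eE][-+]?[0-9]+)?");
   static const std::regex boolRegex("true|false");

   // std::regex_match requires the whole field to match the pattern.
   if (std::regex_match(field, intRegex))
      return ColType::Integer; // would be stored as a 64-bit long long int
   if (std::regex_match(field, doubleRegex))
      return ColType::Double; // would be stored with double precision
   if (std::regex_match(field, boolRegex))
      return ColType::Boolean; // only the literals `true` and `false`
   return ColType::String; // anything else
}
```

In this sketch `1997` is classified as an integer, `3.14` and `1e-3` as floating point, `true` as boolean and `Ford` as a string.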
@@ -172,6 +251,8 @@ TCsvDS::TCsvDS(std::string_view fileName, bool readHeaders, char delimiter) // T
}
}
////////////////////////////////////////////////////////////////////////
/// Destructor.
TCsvDS::~TCsvDS()
{
   for (auto &record : fRecords) {
...
...
@@ -253,6 +334,12 @@ void TCsvDS::SetNSlots(unsigned int nSlots)
   fEntryRanges.back().second += remainder;
}
////////////////////////////////////////////////////////////////////////
/// Factory method to create a CSV TDataFrame.
/// \param[in] fileName Path of the CSV file.
/// \param[in] readHeaders `true` if the CSV file contains headers as first row, `false` otherwise
/// (default `true`).
/// \param[in] delimiter Delimiter character (default ',').
TDataFrame MakeCsvDataFrame(std::string_view fileName, bool readHeaders, char delimiter)
{
   ROOT::Experimental::TDataFrame tdf(std::make_unique<TCsvDS>(fileName, readHeaders, delimiter));
...
...
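Note on the `SetNSlots` hunk above: its last visible statement adds a `remainder` to the end of the last entry range. The body of the function is not shown in this diff, so the following is only a sketch of how such a split might look. The function `SplitEntries`, the alias `EntryRange` and the half-open `[begin, end)` convention are assumptions made for illustration.

```cpp
#include <utility>
#include <vector>

// Hypothetical alias: a half-open range [first, second) of entry indices.
using EntryRange = std::pair<unsigned long long, unsigned long long>;

// Hypothetical helper (not the TCsvDS code): divide nEntries into nSlots
// contiguous ranges of equal size and give the leftover entries to the
// last range.
std::vector<EntryRange> SplitEntries(unsigned long long nEntries, unsigned int nSlots)
{
   std::vector<EntryRange> ranges;
   if (nSlots == 0)
      return ranges; // nothing to split into

   const unsigned long long chunkSize = nEntries / nSlots;
   const unsigned long long remainder = nEntries % nSlots;

   unsigned long long start = 0;
   for (unsigned int i = 0; i < nSlots; ++i) {
      ranges.emplace_back(start, start + chunkSize);
      start += chunkSize;
   }
   // Same idea as the visible line in the hunk: the last range absorbs
   // the entries that did not divide evenly.
   ranges.back().second += remainder;
   return ranges;
}
```

For 10 entries and 3 slots this sketch produces the ranges [0, 3), [3, 6) and [6, 10), so every entry is assigned to exactly one slot.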