Open
Labels: enhancement (New feature or request)
Description
We could consider adding a DSL similar to buildList {}, named buildDataFrame {}, in which rows can be added one at a time.
This would be more performant than appending in a loop, which produces a new DataFrame on every iteration:
    var df = DataFrame.EMPTY
    for (row in data) {
        df = df.append(row....)
    }

It could use something like our ColumnDataCollector under the hood, with one data collector for each column the user wants in the dataframe.
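To make the "one collector per column" idea concrete, here is a minimal sketch in plain Kotlin. It is not the library's implementation: `ColumnCollector`, `RowBuilderScope`, and `buildColumns` are hypothetical names, and the result is a plain `Map` of column lists rather than a DataFrame, so the snippet runs without any dependency.

```kotlin
// Hypothetical stand-in for a per-column collector; NOT the library's ColumnDataCollector.
class ColumnCollector(val name: String) {
    private val data = ArrayList<Any?>()
    fun add(value: Any?) { data.add(value) }
    fun toList(): List<Any?> = data
}

// Hypothetical builder scope: one collector per declared column.
class RowBuilderScope(columnNames: List<String>) {
    private val collectors = columnNames.map { ColumnCollector(it) }

    // Appends one row; values are matched to columns by position.
    fun append(vararg values: Any?) {
        require(values.size == collectors.size) {
            "Expected ${collectors.size} values, got ${values.size}"
        }
        for (i in values.indices) collectors[i].add(values[i])
    }

    // Columns are materialized once at the end instead of once per row.
    fun build(): Map<String, List<Any?>> = collectors.associate { it.name to it.toList() }
}

// Sketch of the entry point; a real version would return a DataFrame rather than a Map.
fun buildColumns(
    vararg columnNames: String,
    block: RowBuilderScope.() -> Unit,
): Map<String, List<Any?>> = RowBuilderScope(columnNames.toList()).apply(block).build()

fun main() {
    val columns = buildColumns("a", "b") {
        for (i in 1..3) {
            if (i == 2) continue
            append(i, "row$i")
        }
    }
    println(columns) // {a=[1, 3], b=[row1, row3]}
}
```

Each appended row here costs one `add` per column, and the final structure is assembled only once in `build()`.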
Ideas for the DSL:
- append-like
    val df = buildDataFrame("a", "b", "c") {
        for (row in data) {
            if (row.isSomething) continue
            append(row.myA, row.myB, row.myC)
        }
    }

or
    val df = buildDataFrame {
        for (row in data) {
            if (row.isSomething) continue
            append("a" to row.myA, "b" to row.myB, "c" to row.myC)
        }
    }

- map-like
    val df = buildDataFrame {
        for (row in data) {
            if (row.isSomething) continue
            add("a", row.myA) // or put?
            this["b"] = row.myB
            put("c", row.myC) // or add?
        }
    }

- toDataFrame/add-like
    val df = buildDataFrame {
        for (row in data) {
            if (row.isSomething) continue
            "a" from { row.myA }
            "b" from row.myB // {} are not needed here because we have a single row
            row.myC into "c"
        }
    }

(This should not be confused with the column-based DynamicDataFrameBuilder)
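The name-to-value flavour of the append-like idea (`append("a" to ..., "b" to ...)`) raises one design question: column names are not known up front, so a column may first appear after some rows already exist. The sketch below shows one possible answer, assuming missing cells should become `null`. `NamedRowBuilderScope` and `buildNamedColumns` are hypothetical names, and the result is again a plain `Map` rather than a DataFrame.

```kotlin
// Hypothetical scope for the name-to-value variant; columns are created on first use.
class NamedRowBuilderScope {
    private val columns = LinkedHashMap<String, MutableList<Any?>>()
    private var rowCount = 0

    fun append(vararg cells: Pair<String, Any?>) {
        for ((name, value) in cells) {
            // A column first seen at row k is back-filled with k nulls.
            val column = columns.getOrPut(name) { MutableList<Any?>(rowCount) { null } }
            require(column.size == rowCount) { "Duplicate column '$name' in one row" }
            column.add(value)
        }
        rowCount++
        // Columns not mentioned in this row get a null so all columns keep the same length.
        for (column in columns.values) {
            if (column.size < rowCount) column.add(null)
        }
    }

    fun build(): Map<String, List<Any?>> = columns
}

// Sketch of the entry point; a real version would return a DataFrame rather than a Map.
fun buildNamedColumns(block: NamedRowBuilderScope.() -> Unit): Map<String, List<Any?>> =
    NamedRowBuilderScope().apply(block).build()

fun main() {
    val columns = buildNamedColumns {
        append("a" to 1)
        append("a" to 2, "b" to "x")
        append("b" to "y")
    }
    println(columns) // {a=[1, 2, null], b=[null, x, y]}
}
```

The positional variant with names declared up front avoids this padding entirely, at the cost of fixing the schema before the first row.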
Obligatory mention: we can already do something similar in the latest -dev version, though it creates n DataFrames under the hood:
    val df = buildList {
        for (row in data) {
            if (row.isSomething) continue
            this += mapOf("a" to row.myA, "b" to row.myB, "c" to row.myC).toDataRow()
        }
    }.concat()