2023-03-14

This is Maëlle's DBGI daily open-notebook.

Today is 2023.03.14

TODO

Check discussion: https://github.com/orgs/digital-botanical-gardens-initiative/discussions
Check Zulip
try searching bar instead of dropdown
have a look at databricks
frictionless https://frictionlessdata.io/

CODE

NOTES

openBIS workshop

⚠️ discuss backup of our postgres with pma

databricks

https://hevodata.com/learn/what-is-databricks/

TODO NEXT

Important for redaction

From ChatGPT for search bar:

If you have a large DataFrame and want to optimize the performance of your search function, there are a few things you can do:

Use vectorized operations instead of loops: In general, it's faster to use vectorized operations on a pandas DataFrame than to iterate over each row or column. For example, you can use the apply() method to apply a function to each element of a DataFrame column, or the str.contains() method to search for a pattern in a DataFrame column.

Use regular expressions carefully: Regular expressions can be powerful, but they can also be slow if the pattern is complex or the data is large. If possible, use simpler patterns that can be matched efficiently.

Use indexing: If you're searching for matches in a DataFrame column, you can use indexing to speed up the search. For example, if you know that a column is sorted, you can use binary search to quickly find the range of rows that match your search pattern.

Use parallel processing: If your machine has multiple cores, you can use parallel processing to speed up the search. For example, you can split the DataFrame into chunks and process each chunk in parallel using the multiprocessing module.