Streamlit tips & tricks
Main function
It's recommended to put all your Streamlit logic into a main function like this:
def main():
    # Render Streamlit app
    ...

if __name__ == "__main__":
    main()
For instance, this makes it easy to have an early return if the app is waiting on user input before displaying the rest of the UI.
import streamlit as st

def main():
    uploaded_file = st.file_uploader("Upload your data", type=["xlsx", "csv"])
    if not uploaded_file:
        return
    # rest of the app...
This setup allows you to make main asynchronous, if necessary (see below).
Async Streamlit
By default, Streamlit is not equipped to deal with asynchronous code. This becomes an issue with some of our more recent utilities, like SharePointClientV2.
The workaround is a utility available in data_tools.async_streamlit that turns any asynchronous function into a synchronous one within a Streamlit app:
from data_tools.async_streamlit import make_sync

@make_sync
async def load_some_data():
    # do some async stuff ...
    ...

def main():
    # load_some_data is now a regular function, so no await is needed
    data = load_some_data()
    # rest of the app...

if __name__ == "__main__":
    main()
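To illustrate the idea behind such a decorator, here is a minimal sketch. It is not the actual data_tools implementation: the name make_sync_sketch is hypothetical, and it assumes no event loop is already running in the calling thread (the real utility may handle event loops differently).

```python
import asyncio
import functools


def make_sync_sketch(async_fn):
    """Hypothetical stand-in for make_sync: run the coroutine to completion."""

    @functools.wraps(async_fn)
    def wrapper(*args, **kwargs):
        # asyncio.run creates a fresh event loop, runs the coroutine, then closes the loop.
        # It raises RuntimeError if an event loop is already running in this thread.
        return asyncio.run(async_fn(*args, **kwargs))

    return wrapper


@make_sync_sketch
async def load_some_data(n: int) -> list:
    await asyncio.sleep(0)  # placeholder for real async I/O
    return list(range(n))


# Called like a regular function: no await, no coroutine object returned.
data = load_some_data(3)
```

The wrapper blocks until the coroutine finishes, which is what lets ordinary synchronous Streamlit code consume the result directly.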
If you're using async throughout the page, it's recommended to just make your main function async:
import asyncio

from data_tools.async_streamlit import make_sync

async def load_some_data():
    ...

async def load_more_data():
    ...

@make_sync
async def main():
    # Both functions must be called so that gather receives coroutine objects
    data, more_data = await asyncio.gather(load_some_data(), load_more_data())
    # rest of the app...

if __name__ == "__main__":
    main()
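The benefit of an async main is that independent loads can overlap. The following standalone sketch uses asyncio.sleep as a stand-in for real I/O (the function bodies and timings are illustrative assumptions) and runs outside Streamlit with asyncio.run instead of make_sync.

```python
import asyncio
import time


async def load_some_data() -> str:
    await asyncio.sleep(0.2)  # stand-in for a slow network or database call
    return "some data"


async def load_more_data() -> str:
    await asyncio.sleep(0.2)
    return "more data"


async def main():
    start = time.perf_counter()
    # gather expects awaitables, so each function is called to produce a coroutine object.
    # Results come back in the same order as the arguments.
    data, more_data = await asyncio.gather(load_some_data(), load_more_data())
    elapsed = time.perf_counter() - start
    return data, more_data, elapsed


data, more_data, elapsed = asyncio.run(main())
```

Because the two sleeps overlap, the total time is roughly that of the slower load rather than the sum of both.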
@st_session_cache
Streamlit offers two built-in cache decorators: @st.cache_resource and @st.cache_data. While those are useful, most of the time we want to cache at the session level: this avoids refetching data every time a user interacts with the page, while still letting the data refresh when the user reloads the page.
To make this easier, we have the @st_session_cache decorator, which works the same way as @st.cache_data: for the same arguments, it will run once and return the same result for the duration of the session.
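As a rough illustration of session-level caching, the sketch below stores results in a plain dict that stands in for st.session_state, so it runs outside Streamlit. The names session_cache_sketch and session_state are hypothetical, the sketch only handles hashable positional arguments, and the actual decorator may be implemented differently.

```python
import functools

# Stand-in for st.session_state (assumption): one dict per user session.
session_state: dict = {}


def session_cache_sketch(fn):
    """Hypothetical sketch: cache results per session, keyed by function and arguments."""

    @functools.wraps(fn)
    def wrapper(*args):
        key = (fn.__qualname__, args)
        if key not in session_state:
            session_state[key] = fn(*args)
        return session_state[key]

    return wrapper


calls = []  # records how often the underlying function actually runs


@session_cache_sketch
def load_table(name: str) -> str:
    calls.append(name)
    return f"rows of {name}"


first = load_table("analytical.xrf_simplified")   # runs the function
second = load_table("analytical.xrf_simplified")  # served from the session cache
session_state.clear()                             # simulates a page reload (new session)
third = load_table("analytical.xrf_simplified")   # runs the function again
```

Within one session the function body runs once per distinct argument set; after the simulated reload it runs again, which is the refresh behavior described above.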
This decorator can also be combined with @make_sync, described above. Here's an example that incorporates both:
from data_tools.async_streamlit import make_sync
from data_tools.utils import st_session_cache

@st_session_cache
@make_sync
async def load_data_batch(*tables_or_queries: str):
    dataframes = []
    # load data
    return dataframes

def main():
    dfs = load_data_batch(
        "analytical.xrf_simplified",
        "analytical.xrd_simplified",
        "analytical.psa_statistics",
        "leaching.leach_extraction_xrf",
    )
Caveats:
- Arguments are compared either by their hash (if they are hashable) or their identity, using the built-in hash and id functions.
  - If a list, dictionary, or DataFrame is recreated on every render, it will be considered different, even if its contents are identical.
- @st.cache_resource and @st.cache_data also cache any Streamlit elements rendered within their function. This is not the case for @st_session_cache.
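The first caveat can be illustrated with a small sketch of a hash-or-identity comparison. The helper arg_key is hypothetical and only mirrors the behavior described above; it is not taken from the actual decorator.

```python
def arg_key(arg):
    """Hypothetical helper: key an argument by hash if hashable, else by identity."""
    try:
        return ("hash", hash(arg))
    except TypeError:
        # Lists, dicts, and DataFrames are unhashable, so fall back to object identity.
        return ("id", id(arg))


# Hashable arguments with equal contents produce the same key.
tuples_match = arg_key(("a", "b")) == arg_key(("a", "b"))

# Two separately created lists with equal contents are distinct objects,
# so their identity-based keys differ and a cache would miss.
first_list = ["a", "b"]
second_list = ["a", "b"]
lists_match = arg_key(first_list) == arg_key(second_list)

# The very same list object keeps its key.
same_object_match = arg_key(first_list) == arg_key(first_list)
```

One way to sidestep the issue is to pass hashable values such as strings or tuples, or to keep the unhashable object alive across renders (for example in session state) instead of rebuilding it each time.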
session_variable
A simple utility to get a variable from session state, initializing it first if it isn't there yet. You need to pass in a function for the initialization. This is especially useful to keep a single collection (like a list or dict) across renders within a session.
The implementation:
# data-tools/data_tools/utils.py
import streamlit as st

def session_variable(key: str, init):
    # Run init only the first time the key is requested in this session
    if key not in st.session_state:
        st.session_state[key] = init()
    return st.session_state[key]
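A short usage sketch shows how the same collection survives across renders. To keep it runnable outside Streamlit, a plain dict stands in for st.session_state and the render function simulates one script rerun; both are assumptions made for illustration.

```python
# Stand-in for st.session_state (assumption) so the sketch runs outside Streamlit.
session_state: dict = {}


def session_variable(key: str, init):
    # Same logic as the implementation above, using the stand-in dict.
    if key not in session_state:
        session_state[key] = init()
    return session_state[key]


def render() -> list:
    """Simulates one rerun of the page script."""
    # list is passed uncalled: it is the initializer, invoked only on the first render.
    messages = session_variable("messages", list)
    messages.append(f"render {len(messages) + 1}")
    return messages


first_render = render()
second_render = render()
```

Both renders return the same list object, so items appended during one render are still there on the next. Passing list() instead of list would hand over an already created list rather than an initializer function, and the call to init() would fail.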