import streamlit as st
import pandas as pd
import altair as alt
st.title('UFO Data Visualization Analysis Report')
@st.cache_data
def load_data():
    columns = [
        'datetime', 'city', 'state', 'country', 'shape',
        'duration_seconds', 'duration_reported', 'description',
        'date_posted', 'latitude', 'longitude'
    ]
    df = pd.read_csv(
        'https://github.com/UIUC-iSchool-DataViz/is445_data/raw/main/ufo-scrubbed-geocoded-time-standardized-00.csv',
        header=None,
        names=columns
    )
    df['datetime'] = pd.to_datetime(df['datetime'])
    df['year'] = df['datetime'].dt.year
    df['month'] = df['datetime'].dt.month
    df['shape'] = df['shape'].fillna('unknown').str.lower()
    return df
df = load_data()
st.markdown("## 1. Temporal Trends in UFO Sightings")
min_year = int(df['year'].min())
max_year = int(df['year'].max())
year_range = st.slider(
    "Select Year Range",
    min_value=min_year,
    max_value=max_year,
    value=(min_year, max_year)
)
time_agg = st.selectbox(
    "Select Time Aggregation",
    ['Yearly', 'Monthly']
)
if time_agg == 'Yearly':
    time_counts = df[df['year'].between(year_range[0], year_range[1])]\
        .groupby('year').size().reset_index(name='count')
    x_encoding = alt.X('year:Q', title='Year')
else:
    df['date'] = pd.to_datetime(df[['year', 'month']].assign(day=1))
    time_counts = df[df['year'].between(year_range[0], year_range[1])]\
        .groupby('date').size().reset_index(name='count')
    x_encoding = alt.X('date:T', title='Date')
base_chart = alt.Chart(time_counts).encode(
    x=x_encoding,
    y=alt.Y('count:Q', title='Number of Sightings'),
    tooltip=['count:Q']
)
temporal_viz = (
    base_chart.mark_area(opacity=0.3) +
    base_chart.mark_line(color='steelblue') +
    base_chart.mark_point(color='steelblue')
).properties(
    width=700,
    height=400
).interactive()
st.altair_chart(temporal_viz)
st.markdown("""**(1) Features Highlighted**

This visualization emphasizes the temporal evolution of UFO sightings from 1949 to 2013, highlighting both the overall trend and period-to-period fluctuations in reporting frequency. Reporting increases over time, with notable spikes in certain periods that could correlate with significant historical events or changes in reporting methods.

**(2) Design Choices**

I implemented several key design elements to represent the data clearly:

+ A line chart with point markers over a light area fill was chosen because it shows continuous time-series data while keeping individual values readable
+ Interactive tooltips were added to provide exact sighting counts

**(3) Potential Improvements**

Given more time, I would implement dual y-axes to show both sighting frequency and duration, and extend the existing year-range and aggregation controls with seasonal analysis options.""")
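# Sketch of the dual y-axis idea mentioned in the improvements text above.
# This is a hedged illustration, not part of the original app: the names
# `yearly_count_and_duration` and `dual_axis_chart` are hypothetical, nothing
# here is called by the app, and it assumes that 'duration_seconds' may hold
# non-numeric entries that should simply be ignored.
import pandas as pd  # repeated so the sketch stands alone


def yearly_count_and_duration(frame):
    """Return one row per year with the sighting count and median duration."""
    work = frame[['year', 'duration_seconds']].copy()
    # Coerce malformed durations to NaN so the median skips them.
    work['duration_seconds'] = pd.to_numeric(work['duration_seconds'], errors='coerce')
    return (
        work.groupby('year')
        .agg(
            count=('duration_seconds', 'size'),  # 'size' counts rows, NaN included
            median_duration=('duration_seconds', 'median'),
        )
        .reset_index()
    )


def dual_axis_chart(summary):
    """Layer count and median duration lines with independent y-axes."""
    import altair as alt  # local import: only needed when the chart is built
    base = alt.Chart(summary).encode(x=alt.X('year:Q', title='Year'))
    counts = base.mark_line(color='steelblue').encode(
        y=alt.Y('count:Q', title='Number of Sightings')
    )
    durations = base.mark_line(color='darkorange').encode(
        y=alt.Y('median_duration:Q', title='Median Duration (s)')
    )
    # resolve_scale(y='independent') gives each layer its own y-axis.
    return alt.layer(counts, durations).resolve_scale(y='independent')

# Possible usage (assumption): st.altair_chart(dual_axis_chart(yearly_count_and_duration(df)))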
st.markdown("## 2. Analysis of UFO Shape Distribution")
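all_shapes = sorted(df['shape'].unique())
# Default to the ten most frequently reported shapes rather than the first ten alphabetically.
top_shapes = df['shape'].value_counts().head(10).index.tolist()
selected_shapes = st.multiselect(
    "Select UFO Shapes to Display",
    options=all_shapes,
    default=top_shapes
)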
filtered_df = df[df['shape'].isin(selected_shapes)]
shape_counts = filtered_df['shape'].value_counts().reset_index()
shape_counts.columns = ['shape', 'count']
shape_viz = alt.Chart(shape_counts).mark_bar().encode(
    y=alt.Y('shape:N',
            sort='-x',
            title='UFO Shape'),
    x=alt.X('count:Q',
            title='Number of Reports'),
    color=alt.Color('count:Q', scale=alt.Scale(scheme='viridis')),
    tooltip=['shape:N', 'count:Q']
).properties(
    width=700,
    height=max(len(selected_shapes) * 25, 400)
).interactive()
st.altair_chart(shape_viz)
st.markdown("""**(1) Features Highlighted**

This visualization focuses on the distribution of reported UFO shapes. The data shows clear preferences in how witnesses describe UFO shapes, with certain forms being consistently more common across reports. This analysis helps identify patterns in how people perceive and describe unidentified flying objects.

**(2) Design Choices**

The visualization employs several intentional design elements:

+ A horizontal bar chart format was chosen to accommodate long shape labels and enable easy comparison of quantities
+ Bars are sorted in descending order to immediately highlight the most common shapes
+ The chart defaults to ten shapes to maintain clarity and prevent information overload; more can be added through the selector
+ Interactive tooltips provide precise counts for each shape category

**(3) Potential Improvements**

With additional time, I would expand this visualization by adding temporal analysis to show how shape distributions have changed over decades and by incorporating geographical analysis to reveal regional patterns in shape reporting.""")
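# Sketch of the per-decade shape analysis mentioned in the improvements text
# above. This is a hedged illustration, not part of the original app: the
# helper name `shape_counts_by_decade` is hypothetical, nothing here is called
# by the app, and it assumes the 'year' and 'shape' columns built in load_data().
import pandas as pd  # repeated so the sketch stands alone


def shape_counts_by_decade(frame):
    """Count reports per (decade, shape) pair."""
    work = frame[['year', 'shape']].dropna(subset=['year']).copy()
    # Floor each year to the start of its decade, e.g. 1997 -> 1990.
    work['decade'] = (work['year'].astype(int) // 10) * 10
    return work.groupby(['decade', 'shape']).size().reset_index(name='count')

# The result could feed a faceted or stacked bar chart, one panel or color per decade.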