hyzhang00 committed
Commit b508eb9 · verified · 1 Parent(s): ab59d33

Update app.py

Files changed (1)
  1. app.py +89 -35
app.py CHANGED
@@ -2,10 +2,8 @@ import streamlit as st
  import pandas as pd
  import altair as alt

- # Set page title
- st.title(' UFO Data Visualization Analysis Report')

- # Load and process data
  @st.cache_data
  def load_data():
      columns = [
@@ -19,7 +17,6 @@ def load_data():
          header=None,
          names=columns
      )
-     # print(df)

      df['datetime'] = pd.to_datetime(df['datetime'])
      df['year'] = df['datetime'].dt.year
@@ -27,79 +24,136 @@ def load_data():
      df['shape'] = df['shape'].fillna('unknown').str.lower()
      return df

-
  df = load_data()

- # First visualization
  st.markdown("## 1. Temporal Trends in UFO Sightings")


- yearly_counts = df.groupby('year').size().reset_index(name='count')
- yearly_viz = alt.Chart(yearly_counts).mark_line(
-     point=True
- ).encode(
-     x=alt.X('year:Q', title='Year'),
      y=alt.Y('count:Q', title='Number of Sightings'),
-     tooltip=['year:Q', 'count:Q']
  ).properties(
      width=700,
      height=400
- )

- st.altair_chart(yearly_viz)

  st.markdown("""
  **(1) Features Highlighted**

- This visualization emphasizes the temporal evolution of UFO sightings from 1949 to 2013, highlighting both the overall trend and year-specific fluctuations in reporting frequency. The visualization reveals distinct patterns of increased reporting over time, with notable spikes in certain periods that could correlate with significant historical events or changes in reporting methods.

  **(2) Design Choices**

  I implemented several key design elements for optimal data representation:
- + A line chart with point markers was chosen for its effectiveness in showing continuous time-series data while maintaining precise year-specific values
- + Interactive tooltips were added to provide exact sighting counts

  **(3) Potential Improvements**

- Given more time, I would implement dual y-axes to show both sighting frequency and duration, and add the capability to filter by time periods and incorporate monthly/seasonal analysis options.
  """)

- # Second visualization
  st.markdown("## 2. Analysis of UFO Shape Distribution")

- shape_counts = df['shape'].value_counts().reset_index()
- shape_counts.columns = ['shape', 'count']
  shape_viz = alt.Chart(shape_counts).mark_bar().encode(
      y=alt.Y('shape:N',
-             sort='-x',
              title='UFO Shape'),
      x=alt.X('count:Q',
-             title='Number of Reports'),
-     color=alt.Color('count:Q',
-                     legend=None),
      tooltip=['shape:N', 'count:Q']
  ).properties(
      width=700,
-     height=400
- )

  st.altair_chart(shape_viz)

-
  st.markdown("""
  **(1) Features Highlighted**

- This visualization focuses on the distribution of reported UFO shapes. The data shows clear preferences in how witnesses describe UFO shapes, with certain forms being consistently more common across reports. This analysis helps identify patterns in how people perceive and describe unidentified flying objects.

  **(2) Design Choices**

- The visualization employs several intentional design elements:
- + A horizontal bar chart format was chosen to accommodate long shape labels and enable easy comparison of quantities
- + Bars are sorted in descending order to immediately highlight the most common shapes
- + The chart focuses on the top 10 shapes to maintain clarity and prevent information overload
- + Interactive tooltips provide precise counts for each shape category.

  **(3) Potential Improvements**

- With additional time, I would expand this visualization by adding temporal analysis to show how shape distributions have changed over decades, incorporate geographical analysis to reveal regional patterns in shape reporting.
  """)
 
@@ -2,10 +2,8 @@ import streamlit as st
  import pandas as pd
  import altair as alt

+ st.title('UFO Data Visualization Analysis Report')

  @st.cache_data
  def load_data():
      columns = [
@@ -19,7 +17,6 @@ def load_data():
          header=None,
          names=columns
      )

      df['datetime'] = pd.to_datetime(df['datetime'])
      df['year'] = df['datetime'].dt.year
@@ -27,79 +24,136 @@ def load_data():
      df['shape'] = df['shape'].fillna('unknown').str.lower()
      return df

  df = load_data()

  st.markdown("## 1. Temporal Trends in UFO Sightings")

+ min_year = int(df['year'].min())
+ max_year = int(df['year'].max())
+ year_range = st.slider(
+     "Select Year Range",
+     min_value=min_year,
+     max_value=max_year,
+     value=(min_year, max_year)
+ )
+
+ time_agg = st.selectbox(
+     "Select Time Aggregation",
+     ['Yearly', 'Monthly']
+ )

+ if time_agg == 'Yearly':
+     time_counts = df[df['year'].between(year_range[0], year_range[1])]\
+         .groupby('year').size().reset_index(name='count')
+     x_encoding = alt.X('year:Q', title='Year')
+ else:
+     df['date'] = pd.to_datetime(df[['year', 'month']].assign(day=1))
+     time_counts = df[df['year'].between(year_range[0], year_range[1])]\
+         .groupby('date').size().reset_index(name='count')
+     x_encoding = alt.X('date:T', title='Date')
+
+ base_chart = alt.Chart(time_counts).encode(
+     x=x_encoding,
      y=alt.Y('count:Q', title='Number of Sightings'),
+     tooltip=['count:Q']
+ )
+
+ temporal_viz = (
+     base_chart.mark_area(opacity=0.3) +
+     base_chart.mark_line(color='steelblue') +
+     base_chart.mark_point(color='steelblue')
  ).properties(
      width=700,
      height=400
+ ).interactive()

+ st.altair_chart(temporal_viz)

  st.markdown("""
  **(1) Features Highlighted**

+ This visualization emphasizes the temporal evolution of UFO sightings, with enhanced interactive features:
+ - Adjustable time range using the slider
+ - Choice between yearly and monthly aggregation
+ - Interactive area and line combination chart
+ - Hover tooltips for detailed information

  **(2) Design Choices**

  I implemented several key design elements for optimal data representation:
+ + Combined area and line chart to show both trends and volume
+ + Interactive time range selection for focused analysis
+ + Multiple temporal granularity options
+ + Synchronized area, line, and point markers for clear data representation

  **(3) Potential Improvements**

+ Future enhancements could include seasonal analysis and correlation with historical events.
  """)
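The monthly branch of the temporal chart builds a `date` column from `year` and `month`; the line that defines `month` falls outside the hunks shown in this diff. As an alternative, the sketch below derives month starts directly from the `datetime` column and returns the counts without adding a column to the loaded frame. The helper name `monthly_counts` and its parameters are hypothetical, and the only assumption about the data is that `datetime` has already been parsed, as `load_data()` does.

```python
import pandas as pd


def monthly_counts(df: pd.DataFrame, start_year: int, end_year: int) -> pd.DataFrame:
    """Count rows per calendar month within an inclusive year range.

    Assumes `df` has a parsed datetime column named 'datetime'.
    The input frame is only read, never modified.
    """
    in_range = df[df["datetime"].dt.year.between(start_year, end_year)]
    # Truncate every timestamp to the first day of its month.
    month_start = in_range["datetime"].dt.to_period("M").dt.to_timestamp()
    # Group by the derived key instead of storing it as a new column.
    return (
        in_range.groupby(month_start.rename("date"))
        .size()
        .reset_index(name="count")
    )


if __name__ == "__main__":
    sample = pd.DataFrame(
        {
            "datetime": pd.to_datetime(
                ["1999-01-05", "1999-01-20", "1999-03-02", "2005-07-04"]
            )
        }
    )
    print(monthly_counts(sample, 1999, 2000))
```

The result has the same `date` and `count` columns that the `date:T` encoding in the diff expects, so it could stand in for `time_counts` in the monthly branch.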
 
 
  st.markdown("## 2. Analysis of UFO Shape Distribution")

+ all_shapes = sorted(df['shape'].unique())
+ selected_shapes = st.multiselect(
+     "Select UFO Shapes to Display",
+     options=all_shapes,
+     default=all_shapes[:10]
+ )
+
+ sort_option = st.selectbox(
+     "Sort By",
+     ['Frequency', 'Alphabetical', 'Average Duration']
+ )
+
+ filtered_df = df[df['shape'].isin(selected_shapes)]
+
+ if sort_option == 'Frequency':
+     shape_counts = filtered_df['shape'].value_counts().reset_index()
+     shape_counts.columns = ['shape', 'count']
+     sort_field = '-x'
+ elif sort_option == 'Alphabetical':
+     shape_counts = filtered_df['shape'].value_counts().reset_index()
+     shape_counts.columns = ['shape', 'count']
+     sort_field = 'shape'
+ else:
+     shape_counts = filtered_df.groupby('shape')['duration_seconds'].mean().reset_index()
+     shape_counts.columns = ['shape', 'count']
+     sort_field = '-count'
+
  shape_viz = alt.Chart(shape_counts).mark_bar().encode(
      y=alt.Y('shape:N',
+             sort=sort_field,
              title='UFO Shape'),
      x=alt.X('count:Q',
+             title='Number of Reports' if sort_option == 'Frequency' else 'Average Duration (seconds)'),
+     color=alt.Color('count:Q', scale=alt.Scale(scheme='viridis')),
      tooltip=['shape:N', 'count:Q']
  ).properties(
      width=700,
+     height=max(len(selected_shapes) * 25, 400)
+ ).interactive()

  st.altair_chart(shape_viz)
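The "Sort By" options are mapped to the strings `'-x'`, `'shape'`, and `'-count'` for the `sort` argument. Altair documents the string form of `sort` as `'ascending'`, `'descending'`, or an encoding channel name with an optional minus prefix (such as `'-x'`), so whether a field name like `'-count'` is accepted is worth checking. One way to sidestep the question is to compute the order in pandas and pass an explicit list, which `sort` also accepts. The sketch below illustrates that idea; the helper name `shape_order` is hypothetical, and it assumes a summary frame with the `shape` and `count` columns used in the diff.

```python
from __future__ import annotations

import pandas as pd


def shape_order(summary: pd.DataFrame, sort_option: str) -> list[str]:
    """Return shape names in display order for the chosen sort option.

    Assumes `summary` has columns 'shape' and 'count', where 'count'
    holds either report counts or mean durations.
    """
    if sort_option == "Alphabetical":
        ordered = summary.sort_values("shape")
    else:
        # 'Frequency' and 'Average Duration': largest value first.
        ordered = summary.sort_values("count", ascending=False, kind="stable")
    return ordered["shape"].tolist()


if __name__ == "__main__":
    demo = pd.DataFrame(
        {"shape": ["disk", "circle", "light"], "count": [5, 9, 7]}
    )
    print(shape_order(demo, "Frequency"))
    print(shape_order(demo, "Alphabetical"))
```

The returned list could then be passed as `sort=shape_order(shape_counts, sort_option)` inside `alt.Y('shape:N', ...)`, which keeps all three sort options on the same code path.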
 
 
  st.markdown("""
  **(1) Features Highlighted**

+ This enhanced visualization now includes:
+ - Multiple shape selection capability
+ - Dynamic sorting options
+ - Color encoding for better pattern recognition
+ - Interactive tooltips and filtering

  **(2) Design Choices**

+ The visualization employs several interactive elements:
+ + Multi-select dropdown for shape filtering
+ + Dynamic sorting options for different analytical perspectives
+ + Color gradient to emphasize differences
+ + Responsive height adjustment based on selected shapes
+ + Interactive tooltips for detailed information

  **(3) Potential Improvements**

+ Future enhancements could include geographical distribution analysis and temporal trend analysis by shape.
  """)