asquirous committed on
Commit 388205b · verified
1 Parent(s): 7de325e

Update README.md

Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -7,4 +7,10 @@ sdk: static
  pinned: false
  ---
 
- BigBanyanTree is an initiative to empower engineering colleges to set up their data engineering clusters and drive interest in data processing and analysis using tools such as Apache Spark.
+ BigBanyanTree is an initiative to empower engineering colleges to set up their data engineering clusters and drive interest in data processing and analysis using tools such as Apache Spark.
+
+ As part of that initiative, we have open-sourced datasets processed from CommonCrawl data.
+
+ The datasets offer two subsets having the specified columns:
+ "script_extraction": ["ip", "host", "server", "script_src_attrs"]
+ "ipmaxmind": ["ip", "host", "server", "postal_code", "latitude", "longitude", "accuracy_radius", "continent_code", "continent_name", "country_iso_code", "subdivision_code", "city_name", "metro_code", "time_zone", "year"]
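
The two column lists added in this commit can be written down as a small schema check. The following is a minimal sketch in Python and is not part of the commit: the names `SUBSET_COLUMNS`, `missing_columns`, and `shared_columns` are hypothetical, and the column names are copied verbatim from the added README lines. It assumes a record is a plain dict keyed by column name.

```python
# Column lists copied from the README lines added in this commit.
# The dict and helper names are hypothetical, for illustration only.
SUBSET_COLUMNS = {
    "script_extraction": ["ip", "host", "server", "script_src_attrs"],
    "ipmaxmind": [
        "ip", "host", "server", "postal_code", "latitude", "longitude",
        "accuracy_radius", "continent_code", "continent_name",
        "country_iso_code", "subdivision_code", "city_name", "metro_code",
        "time_zone", "year",
    ],
}


def missing_columns(subset, record):
    """Return the documented columns of `subset` that `record` lacks, in order."""
    expected = SUBSET_COLUMNS[subset]
    return [name for name in expected if name not in record]


def shared_columns():
    """Return the columns documented for both subsets, in script_extraction order."""
    ipmaxmind = set(SUBSET_COLUMNS["ipmaxmind"])
    return [name for name in SUBSET_COLUMNS["script_extraction"] if name in ipmaxmind]
```

Under these assumptions, `missing_columns("script_extraction", row)` reports which documented columns a row lacks, and `shared_columns()` lists the columns the two subsets have in common.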