Web source: Database index (planecrashinfo.com)
Goal is to extract the accident details for each year and date. That means get a access to level 3 data.
data:image/s3,"s3://crabby-images/ff7d1/ff7d123746a3edb0693ae977d90a4e921e3d5fd7" alt=""
Step 1
Bring the Accident Details page in.
data:image/s3,"s3://crabby-images/99b94/99b94ab07385513a16cbd6c7912c170dfcacf69d" alt=""
Step 2
Pivot
Go to: Transform > Pivot > Advanced options > Don't Aggregate
When data has been pivoted you can decide which columns (15) you want to keep or remove. You can also do data cleaning transformations. I will not demonstrate that, my goal is just focus on extracting the data.
data:image/s3,"s3://crabby-images/ed68a/ed68a84e1e9e5559d782e10f110c7fd0b79431b7" alt=""
Step 3
Get years.
data:image/s3,"s3://crabby-images/ad8eb/ad8eb81f78cc8d4d63ff7fdbeafa273690f6689b" alt=""
Unpivot Columns
data:image/s3,"s3://crabby-images/86e3c/86e3c9158c01cfd4bee4329c587b2f672ce83835" alt=""
Keep just column with years, remove bottom rows (8 empty rows)
data:image/s3,"s3://crabby-images/7e638/7e6388565e27fd378ad5641fd4152cf1b4b7590a" alt=""
By now I have the access to level 1 data. Next task is to get access to level 2. Through level 2 I will be able to reach on data in level 3.
Step 4
Get Level 2 data. Remove all the columns and keep just one, dates. I will use these data to build a function to unlock data in level 3.
data:image/s3,"s3://crabby-images/ddc55/ddc5550ed76a3ce61114115c6b3a536f639815f3" alt=""
Step 5
After I have removed all the other columns I transform this dataset into a function.
data:image/s3,"s3://crabby-images/d0d5c/d0d5c60a3b35a7040540b8b2f5b5ef723f823a06" alt=""
Step 6
(Year)=>
let
Source = Web.Page(Web.Contents("http://www.planecrashinfo.com/"&Year&"/"&Year&".htm")),
As the URL is a text, Year variable must be a text too.
data:image/s3,"s3://crabby-images/08531/0853163dea5aadf53cace327f4c1e6f561385c7e" alt=""
Step 7
In step 4 I extracted LEVEL 2 data. I will return them back here using tblHelp function.
data:image/s3,"s3://crabby-images/374f2/374f29d79c1c802e45a4cd993a65fe616e592638" alt=""
Step 8
For year 1920 there are 51 date records in that table. I will add an index to each date.
data:image/s3,"s3://crabby-images/1b356/1b356fe55e55357c41a54bb1a96b978023debc37" alt=""
Step 9
We add an Index to the Date column.
=>Add custom column
=Table.AddIndexColumn([Custom],"Index",1,1)
data:image/s3,"s3://crabby-images/5dff5/5dff51e1c8b3d4a8da0636b29d74ea7c4138b5c8" alt=""
Expand IndexingCustomColumn column.
data:image/s3,"s3://crabby-images/2b2b1/2b2b1dff706ab0fd48ed2c70949c60a551461820" alt=""
Remove unnecessary columns.
data:image/s3,"s3://crabby-images/bddc0/bddc04690f09ec8c0cd0c686b9a2674a72f89411" alt=""
Now we have this key to get to access to Level 3 data, AccidentDetails.
data:image/s3,"s3://crabby-images/83734/83734b2fb22775debd4971de726958d1ea38847d" alt=""
Step 10
I convert this dataset into a function.
data:image/s3,"s3://crabby-images/690ea/690ea929b9d69e26e933782c246f208365e9b597" alt=""
(year, index)=>
let
Source = Web.Page(Web.Contents("http://www.planecrashinfo.com/"&year&"/"&year&"-"&index&".htm"))
data:image/s3,"s3://crabby-images/9eca4/9eca42bf66a6b2c44d4b4a43ea2d03001556a5ee" alt=""
This function needs 2 arguments, which will be supplied form AirAcciddents table.
data:image/s3,"s3://crabby-images/90faa/90faa63e59c4cac34086ef7e8ae0493c0d4e87eb" alt=""
Step 11
Add Custom column, use AccidentDetails function.
data:image/s3,"s3://crabby-images/dbbac/dbbac85f233df0e1c2049bc752e9f05284f2a2e7" alt=""
Expand the column names from the FinalTable and bring them in.
data:image/s3,"s3://crabby-images/f1540/f1540ecb0121474b08a893a35b97d54389eaec06" alt=""
Now I have all the accident details in one table.
data:image/s3,"s3://crabby-images/798e2/798e23e3146eaa95e3a4b9a16c34cdf3c39abed6" alt=""
Conclusion: It takes 5 hours to Power BI Desktop to load all the rows. MS Excel can load these data in 1 hour.
data:image/s3,"s3://crabby-images/0f715/0f715709ef15ae3c6877696f038de92c732584d1" alt=""
Comments