User:Adewalker/Gigs Reconciliation Jan 2015
Introduction
I decided to reconcile the Main Gigs page listings with the categories currently in use, namely Category:Gigs_by_country and Category:Gigs_in_location, to make sure that all the gigs are correctly categorised. The Gigbox template takes care of this categorisation automatically, and the result is correct unless the Gigbox itself contains a location or country error. However, quite a few gig pages do not use the Gigbox, so this automatic categorisation doesn't take place for them.
Methodology
Here's what I did.
Data from Gigs templates
- Copied all the 1991 to 2014 gig template contents from the Main Gigs page into a spreadsheet
- In my spreadsheet, added additional columns for Year, Redlink and Cancelled so that I could filter the data as required.
- Year is useful for sorting as some dates are incomplete, eg 1996-10-??, and are therefore treated by Excel as text rather than a date.
- Redlink gigs will not appear in any of the gig categories, because the gig pages don't exist, so I added a flag in the relevant rows in my spreadsheet.
- Cancelled gigs do not get categorised either (as per the Gigbox code), so these too needed to be flagged.
- I decided to ignore 2015 as this list is still being added to and there is still updating going on with these gigs.
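The Year column can be derived mechanically even when a date is incomplete. A minimal Python sketch, assuming dates are held as `YYYY-MM-DD` text with `??` for unknown parts; the function name `gig_year` is hypothetical and is not part of the spreadsheet or the wiki:

```python
import re

def gig_year(date_text):
    """Return the four-digit year from a gig date such as '1996-10-??'."""
    match = re.match(r"(\d{4})-", date_text.strip())
    if match is None:
        raise ValueError(f"no year found in {date_text!r}")
    return int(match.group(1))

# Incomplete dates sort alongside complete ones once the year is extracted.
dates = ["2004-07-10", "1996-10-??", "1999-10-31"]
dates_by_year = sorted(dates, key=gig_year)
```

Sorting on the extracted year avoids the problem of Excel treating the incomplete dates as text.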
Analysis of Gigs template data
Results so far:
- Total number of Gig template gigs = 1306
- 1991 = 2
- 1992 = 1
- 1993 = 1
- 1994 = 6
- 1995 = 14
- 1996 = 4
- 1997 = 33
- 1998 = 20 (Redlink = 2)
- 1999 = 147 (Redlink = 12)
- 2000 = 146 (Redlink = 42)
- 2001 = 120 (Redlink = 12)
- 2002 = 38
- 2003 = 53
- 2004 = 147 (Redlink = 2) (Cancelled = 4)
- 2005 = 23
- 2006 = 106 (Cancelled = 2)
- 2007 = 111 (Cancelled = 10)
- 2008 = 17
- 2009 = 56
- 2010 = 101 (Cancelled = 4)
- 2011 = 18
- 2012 = 42 (Cancelled = 3)
- 2013 = 96 (Cancelled = 1)
- 2014 = 4 (Cancelled = 1)
- Total = 1306
- Total number of Redlink gigs = 70
- Total number of Cancelled gigs = 25
- None of the Cancelled gigs are Redlinks
From this raw data:
- Exclude Redlinks and Cancelled gigs, ie 70 + 25 = 95 to be excluded
- Theoretical number of gigs categorised by country and location should be 1306 - 95 = 1211, based on Gig/year boxes on Main Gigs page
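The totals above can be cross-checked with a few lines of Python. This is only a sketch: the per-year figures are copied from the results list, and the variable names are my own.

```python
# Per-year counts from the results list: year -> (gigs, redlinks, cancelled)
counts = {
    1991: (2, 0, 0),    1992: (1, 0, 0),    1993: (1, 0, 0),
    1994: (6, 0, 0),    1995: (14, 0, 0),   1996: (4, 0, 0),
    1997: (33, 0, 0),   1998: (20, 2, 0),   1999: (147, 12, 0),
    2000: (146, 42, 0), 2001: (120, 12, 0), 2002: (38, 0, 0),
    2003: (53, 0, 0),   2004: (147, 2, 4),  2005: (23, 0, 0),
    2006: (106, 0, 2),  2007: (111, 0, 10), 2008: (17, 0, 0),
    2009: (56, 0, 0),   2010: (101, 0, 4),  2011: (18, 0, 0),
    2012: (42, 0, 3),   2013: (96, 0, 1),   2014: (4, 0, 1),
}

total_gigs = sum(gigs for gigs, _, _ in counts.values())
total_redlinks = sum(red for _, red, _ in counts.values())
total_cancelled = sum(canc for _, _, canc in counts.values())

# No cancelled gig is a redlink, so the two exclusions can simply be added.
expected_categorised = total_gigs - (total_redlinks + total_cancelled)

print(total_gigs, total_redlinks, total_cancelled, expected_categorised)
# prints: 1306 70 25 1211
```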
Check quality of Gigs template Country data
There are errors in the naming of country (location) data in the Gigs/year templates. These errors probably arise because the tables are keyed manually, so the naming of countries and provinces/states is not forced to comply with the naming standards we use in actual Gig pages (thanks to the Gigbox).
Fixes required (in Gigs/year templates):
- Gigs/2000 - 2000-08-18 Flughafen - NW Germany (should be NW, Germany)
- Gigs/2000 - 2000-08-27 Lowlands Festival - Holland (should be Netherlands)
- Gigs/2001 - 2001-08-04 Witnness - Rep of Ireland (should be Ireland)
- Gigs/2001 - 2001-08-24 Lowlands Festival - The Netherlands (should be Netherlands)
- Gigs/2004 - 2004-01-19 Metro Theatre - NS Australia (should be NS, Australia)
- Gigs/2004 - 2004-05-31 Megaland - LI Netherlands (should be LI, Netherlands)
- Gigs/2004 - 2004-07-10 Balado Park - shown as Scotland (should be UK)
- Gigs/2008 - 2008-07-30 Vivo Rio - Brasil spelling issue (should be Brazil)
- Gigs/2010 - 2010-07-09 Balado Park - shown as Scotland (should be UK)
- Gigs/2013 - 2013-09-14 Parque Olimpico - Brasil spelling issue (should be Brazil)
Note - these errors have no bearing on the actual categorisation of these gigs by country, as the actual category is established by the Gig Page's Gigbox, not the Gigs/year template listing. The reason they need fixing is (a) to be consistent with naming conventions and country names as generated by the Gigbox and (b) to allow me to reconcile my spreadsheet with the actual Gigs in {country} categories.
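For the spreadsheet side of the reconciliation, the corrections listed above can be applied as a simple lookup. A rough Python sketch, assuming the location text is held as a plain string; `LOCATION_FIXES`, `normalise_location` and `country_of` are hypothetical names, not anything that exists on the wiki:

```python
# Corrections taken from the fixes list above (template value -> Gigbox-style value).
LOCATION_FIXES = {
    "NW Germany": "NW, Germany",
    "Holland": "Netherlands",
    "Rep of Ireland": "Ireland",
    "The Netherlands": "Netherlands",
    "NS Australia": "NS, Australia",
    "LI Netherlands": "LI, Netherlands",
    "Scotland": "UK",
    "Brasil": "Brazil",
}

def normalise_location(text):
    """Apply a known correction, or return the text unchanged."""
    cleaned = text.strip()
    return LOCATION_FIXES.get(cleaned, cleaned)

def country_of(text):
    """Return only the country, dropping any leading region initials."""
    return normalise_location(text).split(",")[-1].strip()
```

With this, `country_of("NW Germany")` and `country_of("Germany")` both give `Germany`, so the inconsistent use of region initials does not affect the per-country counts.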
Other issues noticed:
- Another issue noted with country names in the Gigs/year templates is inconsistent use of province/state/region initials. For example, some Netherlands gigs have the state initials, others don't. Same with Germany gigs.
- Brasil vs Brazil. This is an issue at Category level too (to be discussed further down).
Reconcile Excel with Gigs in Country categories
Countries
Having corrected the spreadsheet data (but not the actual errors in the various Gigs/year templates), an initial reconciliation identified the following redlink categories which needed to be created and made subcategories of Gigs by Country:
- http://musewiki.org/Category:Gigs_in_China
- http://musewiki.org/Category:Gigs_in_Croatia
- http://musewiki.org/Category:Gigs_in_Estonia
- http://musewiki.org/Category:Gigs_in_Indonesia
- http://musewiki.org/Category:Gigs_in_Malaysia
- http://musewiki.org/Category:Gigs_in_Monaco
- http://musewiki.org/Category:Gigs_in_Romania
- http://musewiki.org/Category:Gigs_in_Serbia
- http://musewiki.org/Category:Gigs_in_Singapore
- http://musewiki.org/Category:Gigs_in_Taiwan
Note: Category description is created with the {{CountryCat}} macro.
I also had to add the following existing categories to the Gigs by Country:
Once done, the total number of countries = 48 in both the spreadsheet and the Gigs by Country category. Hooray!
Gigs in Countries
Next step was to check that the number of gigs in the various Gigs in {Country} categories agreed with the spreadsheet, which identified the following issues:
- Problems with Lowlands/Hasslet discussed here: http://musewiki.org/Talk:Hasslet_Lowlands_Festival_2000_(gig)
- The Gigs in Brasil category should have been named Gigs in Brazil, to be consistent with the use of English spellings. For the moment, I have left it as "Brasil", though it bugs me. :-)
Results
The resulting reconciliation, after addressing the above issues (except Brasil vs Brazil), is shown in the table below.
Column headings:
- Gig tables = number of gigs as per the analysis in my spreadsheet, which is based on the contents of the various Gigs/year templates (as displayed on the main Gigs page), EXCLUDING Cancelled gigs.
- Redlinks = number of gigs (as per gig page analysis) which are redlinks, ie the Gig Page itself doesn't yet exist.
- Gig pages = 1st column minus 2nd column. This is the figure that we need to reconcile with the category data. Obviously, redlink gigs won't appear in any Gigs in {Country} category.
- No. gigs in actual Category = self-explanatory, ie how many in each "Gigs in {country}" category, EXCLUDING 2015 gigs (20 as at 2015-01-10)! Numbers in () include 2015 gigs as at 2015-01-10.
Country | Gig tables | Redlinks | Gig pages | No. gigs in actual Category |
---|---|---|---|---|
Argentina | 6 | | 6 | 6 |
Australia | 57 | 4 | 53 | 52 |
Austria | 13 | | 13 | 14 (15) |
Belgium | 18 | | 18 | 18 (19) |
Brazil | 8 | | 8 | 6 |
Canada | 30 | | 30 | 30 |
Chile | 2 | | 2 | 2 |
China | 2 | | 2 | 2 |
Colombia | 1 | | 1 | 1 |
Croatia | 1 | | 1 | 1 |
Czech Republic | 2 | | 2 | 2 |
Denmark | 15 | 1 | 14 | 14 (15) |
Estonia | 1 | | 1 | 1 |
Finland | 11 | 1 | 10 | 10 (11) |
France | 120 | 7 | 113 | 113 (116) |
Germany | 93 | 10 | 83 | 81 (83) |
Greece | 4 | 1 | 3 | 3 |
Hungary | 4 | | 4 | 4 |
Iceland | 1 | | 1 | 1 |
Indonesia | 1 | | 1 | 1 |
Ireland | 19 | 3 | 16 | 14 |
Italy | 39 | 2 | 37 | 37 (38) |
Japan | 44 | | 44 | 45 |
Latvia | 3 | | 3 | 3 |
Luxembourg | 3 | | 3 | 3 (4) |
Malaysia | 1 | | 1 | 1 |
Mexico | 10 | | 10 | 10 |
Monaco | 1 | | 1 | 1 |
Netherlands | 24 | 1 | 23 | 23 (24) |
New Zealand | 6 | | 6 | 7 |
Norway | 17 | 3 | 14 | 14 |
Poland | 3 | | 3 | 3 (4) |
Portugal | 12 | | 12 | 12 (13) |
Romania | 1 | | 1 | 1 |
Russia | 7 | | 7 | 7 (9) |
Serbia | 1 | | 1 | 1 |
Singapore | 2 | | 2 | 2 |
South Africa | 2 | | 2 | 2 |
South Korea | 5 | | 5 | 5 |
Spain | 26 | 1 | 25 | 25 (26) |
Sweden | 12 | 2 | 10 | 10 (11) |
Switzerland | 25 | 1 | 24 | 23 (24) |
Taiwan | 1 | | 1 | 1 |
Turkey | 4 | 1 | 3 | 3 |
UK | 330 | 11 | 319 | 316 (317) |
Ukraine | 2 | | 2 | 2 |
UAE | 2 | | 2 | 2 |
USA | 289 | 21 | 268 | 270 |
TOTAL | 1281 | 70 | 1211 | |
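The check behind the table is simply "Gig tables minus Redlinks should equal the category count". A small Python sketch of that comparison, using a handful of rows from the table (category counts exclude 2015 gigs); the function name `reconcile` is my own:

```python
# (country, gig tables, redlinks, gigs in actual category excluding 2015)
rows = [
    ("Australia", 57, 4, 52),
    ("Germany", 93, 10, 81),
    ("Greece", 4, 1, 3),
    ("Ireland", 19, 3, 14),
    ("Norway", 17, 3, 14),
    ("USA", 289, 21, 270),
]

def reconcile(rows):
    """Return {country: category count minus expected gig pages} for mismatches."""
    differences = {}
    for country, gig_tables, redlinks, in_category in rows:
        gig_pages = gig_tables - redlinks  # pages that should actually exist
        if gig_pages != in_category:
            differences[country] = in_category - gig_pages
    return differences

differences = reconcile(rows)
print(differences)
# prints: {'Australia': -1, 'Germany': -2, 'Ireland': -2, 'USA': 2}
```

A negative figure means gig pages are missing from the category (for example, pages without a Gigbox); a positive figure means the category holds pages that are not in the Gigs/year templates.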
Issues to be resolved
Note: The table above is updated as the issues listed below are fixed.
- Austria
- http://musewiki.org/Vienne_2003_%28gig%29 - Gig Page exists, but not listed in Gigs/2003
- Brazil
- Gig page missing for Rio 2013 (2013-09-14)
- Gig page missing for Sao Paulo 2014 (2014-04-05)
- France
- 2004-07-19 shown as Les Moutiers in Gigs/2004 and just Moutiers in Gig page. No impact on reconciliation.
- 1997-11-21 http://musewiki.org/Le_Charleston_1997_%28gig%29 Gig Page should be renamed as Cherbourg_Le_Charleston_1997_-_21st_(gig). Done. Page Moved 2015-01-11. No impact on reconciliation.
- 1997-11-27 http://musewiki.org/Cherbourg_Charleston_1997_%28gig%29 Gig page should be renamed as Cherbourg_Le_Charleston_1997_-_27th_(gig). Done. Page Moved 2015-01-11. No impact on reconciliation.
- 2010-06-12 http://musewiki.org/Paris_Stade_de_France_2010_-_12th_%28gig%29 - Gig Page exists, but not categorised correctly, therefore missing from Gigs in France category. To be investigated. Fixed 2015-01-11.
- Solidays Festival - http://musewiki.org/Solidays_Festival_2000_(gig) - Should be renamed. No impact on reconciliation.
- Aix-les-Bains - http://musewiki.org/Esplanade_du_Lac,_Aix-les-Bains_2015_(gig) - Should be renamed. No impact on reconciliation.
- Germany
- 1999-10-31 - Incognito - Munich, Germany = Appears in Gigs/1999 in plain text. Gig Page doesn't exist. Action required = change to redlink in Gigs/1999 template.
- 1999-10-30 - Knaack - Berlin, Germany = Appears in Gigs/1999 in plain text. Gig Page doesn't exist. Action required = change to redlink in Gigs/1999 template.
- 1999-10-29 - Logo - Hamburg, Germany = Appears in Gigs/1999 in plain text. Gig Page exists here: http://musewiki.org/Hamburg_Logo_Club_1999_(gig). Action required = change to actual link in Gigs/1999 template.
- Ireland
- 1999-10-28 - Whelans - Dublin, Ireland = Appears in Gigs/1999 in plain text. Gig Page doesn't exist. Action required = change to redlink in Gigs/1999 template.
- User:Adam0209M appears in error in Gigs in Ireland category. Fixed 2015-01-11.
- http://musewiki.org/Slane_Castle_2000_%28gig%29 - Gig page exists, but it needs Gigbox, hence currently this page doesn't appear in the Ireland gig category.
- Japan
- http://musewiki.org/Osaka_2001_%28gig%29 - Appears to be a duplicate of http://musewiki.org/Osaka_Club_Quattro_2001_%28gig%29. Action required = remove content and redirect to correct page.
- New Zealand
- This gig: http://musewiki.org/New_Zealand_2007_%28gig%29 appears to be gig announcement info relating to either http://musewiki.org/Auckland_Mt_Smart_Stadium_2007_(gig) or http://musewiki.org/Christchurch_Westpac_Arena_2007_(gig) (less likely). As there are no references for any of the content, not sure what to do with it. Perhaps remove Gigbox and add a link to the two 2007 NZ dates.
- Switzerland
- This gig should be renamed (ie moved): http://musewiki.org/Festival_Soundarena_2001_%28gig%29. No impact on reconciliation.
- http://musewiki.org/St._Gallen_OpenAir_Festival_2000_%28gig%29 exists, but no Gigbox, hence no appearance in Gigs in Switzerland category. Action required = add Gigbox and content to page.
- UK
- Needs renaming: http://musewiki.org/Action_Records_1999_%28gig%29 No impact on reconciliation
- Not in Excel: http://musewiki.org/Ashburton_Lanterns_Hotel_1997_(gig)
- Not in Excel: http://musewiki.org/The_Barfly
- Needs renaming: http://musewiki.org/BBC_Studios_1999_(gig) No impact on reconciliation.
- http://musewiki.org/Glasgow_The_Garage_2000-06-02_%28gig%29 Gig Page needs country changing to United Kingdom
- http://musewiki.org/Leeds_Leeds_Festival_2000_%28gig%29 Gig Page needs Gigbox.
- Needs renaming: http://musewiki.org/National_Youth_Theatre_1991_%28gig%29 Also, although stated as "London" in Gigbox, there is no source to confirm this. No impact on reconciliation.
- Needs renaming: http://musewiki.org/The_Barfly No impact on reconciliation.
- USA
To be continued... Still working on it...