Timestamps are in UTC.
Tantek edited value-class-pattern "/* Basic Parsing */ note special handling of abbr, area, img elements (as before, from [[hcard-parsing]]))" (+284) http://is.gd/vttr
Tantek edited value-class-pattern "simplify and make consistent basic parsing and date time concatenation parsing" (-28) http://is.gd/vtL9
Tantek edited value-class-pattern "/* Parsing date and time concatenation */ add more details for consideration of parsing am/pm suffix in a separate time value, add reference to Wikipedia article on the 12 hour clock" (+186) http://is.gd/vu1O
ThomasLoertsch edited hrecipe "/* ingredient */ typo correction" (+2) http://is.gd/vzSz
Tantek edited value-class-pattern "/* Parsing date and time concatenation */ link fix" (+0) http://is.gd/vDbn
Tantek edited value-class-pattern "allow HH of 24 per ISO8601" (+0) http://is.gd/vFpK
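[Editor's sketch] The edits above concern parsing a separate time value with an am/pm suffix and accepting an hour of 24. The following is a minimal illustration of that idea, not the parsing rules from the value-class-pattern page. The function name `normalize_time` and the exact input forms it accepts are assumptions made for this example.

```python
import re

# Hypothetical helper: accepts "H", "HH:MM" or "HH:MM:SS", optionally
# followed by an am/pm suffix such as "am", "PM" or "p.m.".
_TIME_RE = re.compile(
    r"^(\d{1,2})(?::(\d{2}))?(?::(\d{2}))?\s*(a\.?m\.?|p\.?m\.?)?$",
    re.IGNORECASE,
)

def normalize_time(text):
    """Return 'HH:MM:SS' on a 24-hour clock, or None if the input is rejected."""
    m = _TIME_RE.match(text.strip())
    if not m:
        return None
    hour = int(m.group(1))
    minute = int(m.group(2) or 0)
    second = int(m.group(3) or 0)
    suffix = m.group(4)
    if suffix:
        # With a suffix, only 12-hour clock values 1..12 make sense.
        if not 1 <= hour <= 12:
            return None
        is_pm = suffix.lower().startswith("p")
        # 12am maps to 00 and 12pm stays 12.
        hour = hour % 12 + (12 if is_pm else 0)
    if minute > 59 or second > 59:
        return None
    # Hour 24 is accepted only as exactly 24:00:00 (midnight at end of day).
    if hour > 24 or (hour == 24 and (minute or second)):
        return None
    return f"{hour:02d}:{minute:02d}:{second:02d}"
```

For example, `normalize_time("7pm")` gives `"19:00:00"`, `normalize_time("12:30 am")` gives `"00:30:00"`, and `normalize_time("24:00")` gives `"24:00:00"`.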
Tantek edited html5 "added requests for examples and parsing details for HTML5 (if any are needed)" (+430) http://is.gd/vFXG
Tantek edited licensing-formats "fix link" (+131) http://is.gd/vGfG
Tantek edited licensing-formats "Reverted edits by [[Special:Contributions/Tantek|Tantek]] ([[User talk:Tantek|Talk]]) to last version by [[User:Mike Linksvayer|Mike Linksvayer]]" (-131) http://is.gd/vGgb
Tantek edited licensing-formats "actually fix link, and add a related section cross-linking to related pages" (+113) http://is.gd/vGh8
Tantek edited licensing-examples "added see also section" (+111) http://is.gd/vGhz
Tantek edited licensing-brainstorming "added see also section" (+112) http://is.gd/vGi6
Tantek edited licensing "group/simplify use case hypotheses to "What" and "How to attribute"." (+141) http://is.gd/vGku
Hixie, I've simplified what I see are reasonable and desirable use cases for a licensing microformat to two things: 1. *What* is being licensed, and 2. *How to attribute* (if necessary). I think that if a licensing microformat solved those two, then there wouldn't be a need for ccREL.
See http://microformats.org/wiki/licensing#Usecases_hypothesis for details of that 1 and 2.
seems like a link to a license would be good too
:-)
yes, amazing how we can forget the obvious :)
would be nice to have an algorithm somewhere that, given an HTML DOM, defines how you get a list of URL-license pairs (along with whatever other information is there)
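[Editor's sketch] One possible shape for such an algorithm, under a loud assumption: every `a`, `area`, or `link` element whose `rel` contains the token `license` is treated as licensing the whole page. Deciding *what* is actually being licensed is exactly the open question in this discussion, so pairing each license with the page URL is a placeholder, not a proposal. The names `LicenseLinkFinder` and `license_pairs` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LicenseLinkFinder(HTMLParser):
    """Collects (licensed URL, license URL) pairs from rel="license" links."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "area", "link"):
            return
        attrs = dict(attrs)
        # rel is a space-separated token list; compare case-insensitively.
        rel_tokens = (attrs.get("rel") or "").lower().split()
        href = attrs.get("href")
        if "license" in rel_tokens and href:
            # Assumption: the license applies to the page as a whole.
            self.pairs.append((self.page_url, urljoin(self.page_url, href)))

def license_pairs(html, page_url):
    finder = LicenseLinkFinder(page_url)
    finder.feed(html)
    finder.close()
    return finder.pairs
```

Usage: `license_pairs('<a rel="license" href="/licenses/by/3.0/">CC BY</a>', "http://example.org/photo")` returns one pair whose second element is the license link resolved against the page URL.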
one of the problems with microformats in general is that there isn't a clear way to distinguish where a microformat starts, e.g. "fn" and "vcard" appear the same to someone who doesn't know the hCard microformat
Tantek edited licensing "/* Usecases hypothesis */ added obvious "what is the license" use case hypothesis - thanks Hixie" (+215) http://is.gd/vGpp
this is both a pedagogical issue (noticing where the format starts and ends isn't clear, especially with lots of "css classes" around to muddle the issue)
Hixie, the clear way to distinguish where a microformat starts is the root class name. http://microformats.org/wiki/hcard-parsing#root_class_name
and a more practical issue for people who want to take pages and hand them off to an inference engine (e.g. wolfram alpha) or expose the data to an API (e.g. searchmonkey)
right, there's no way to know without knowing the format, that's what i mean
ah ok - what you're really asking (without knowing you were asking) is, given a URL to an XMDP (which is how microformats are defined, similar to DTDs defining your HTML), how do you know what is the root class name?
no, because i don't think anyone in practice will actually provide XMDPs
how is that any different from anyone in practice providing DTDs?
it's not -- you have to know the HTML spec to parse text/html. Which is a problem, and is why people use XML for a lot of things.
but that's ok because the crowd that uses XML for a lot of things is also very particular about including DTDs, thus can use XHTML in that way, and also be very particular about including XMDPs.
searchmonkey is a counter-example to that
is searchmonkey indexing random XML?
searchmonkey intends to expose arbitrary vocabularies that people come up with, including those microformats that might be invented later, but it can't parse them if it doesn't know about them
so searchmonkey provides incentive for inclusion of XMDP URLs then
because searchmonkey can simply read those XMDPs and discover new microformats in that fashion
new poshformats for that matter
the idea is that it would expose information on pages that don't have any relationship with the person who wants to expose the data
as anyone can write an XMDP that defines the classes and ids they are using
e.g. how it can expose hCard on sites today
those sites don't expose XMDP
sure
anyway, DTDs are an anti-pattern that i'm pretty sure we don't want to replicate here :-)
especially after we got rid of them for HTML
then they might as well just index all classes and IDs
sure but how do they know which classes and IDs are "root" classes and IDs?
there have been proposals to include that in the microformats themselves
without the need for XMDP
BTW I don't see DTDs as much as an anti-pattern as a threshold to meet to answer the "URI extensibility" crowd's questions/issues, since that crowd seems happy with DTDs, all you need is something "just as good".
Hixie, the problem of indicating the root class is also one that overlaps with indicating when a parser has encountered another piece of microformatted information that it shouldn't be looking inside for properties for some microformat higher up in the parse tree.
there is some thinking on that here: http://microformats.org/wiki/mfo
anyway, solutions proposed to date (e.g. RDFa) are generally considered too fragile (authors/developers won't maintain them) to be worthy of pursuing
I'd rather say, there is no solution now, than offer a solution which is expectedly fragile and due to induce/cause data loss in the future.
however, if all you're looking to do is *experiment* with a format (say in a specific vertical area perhaps not significantly published on the web in order to merit a microformat), then random XML, or RDFa etc. may make sense
but even in those cases, I encourage people to do what web designers have been doing for years - just use semantic class names of your own
or the larger practice of semantic HTML
http://microformats.org/wiki/posh
btw, speaking of anti-patterns
the whole notion of wanting to "expose arbitrary vocabularies" is fairly fundamentally flawed, or rather, will only result in Babel.
Tantek edited xmdp-brainstorming "/* root class name identification */ added inline alternative possibility" (+1301) http://is.gd/vGLh
tantek: i don't really care about that crowd as you put it, i'm more worried about things like people wanting to mark up their family history and have searchmonkey then expose it, without them having to get Yahoo! to implement their custom little set of class values
tantek: i don't see why my dad shouldn't be able to mark up his family history pages in a way that he can then extract data from in a consistent manner using tools
tantek: we already have the tower of babel "problem" with class names in general
incidentally what is "hreview-aggregate"? i see it on http://www.tripadvisor.com/Hotel_Review-g155032-d185642-Reviews-La_Maison_Pierre_du_Calvet-Montreal_Quebec.html
but it doesn't seem to be on the hreview wiki page
oh, found it on http://microformats.org/wiki/aggregate-microformat-template-brainstorming
not so hypothetical i guess
hReview aggregate is a microformat being pursued by a bunch of folks, some of them at Google
cool
interesting, I didn't realize sites were trying it out in the wild yet
regarding family history / genealogy, there's the "just use hCard + XFN" answer, and then there's more here: http://microformats.org/wiki/genealogy
and for marklin model trains, is there a microformat for that too?
microformats aren't going to cover everything people want
which is fine
precisely, and that's by design!
http://microformats.org/wiki/microformats#microformats_are_not
since inception
but that doesn't mean that small groups of people aren't going to want to do the same kind of thing with their own pages in small groups
e.g. a bunch of students in a class writing content that the professor aggregates
(university of mary does this with blogs iirc)
right now it's not clear how to write a generic tool to handle arbitrary vocabularies
Hixie, there is a larger problem which is that small groups of people are rarely going to have the necessary experience/skills to actually produce a good data format
including vocabulary
it doesn't have to be good
it just has to be good enough to work for them
so it is inevitable that there will be a bunch of one-off experimental short term vocabularies
sure, for those folks, they can "just use XML" as it has been used to date
i don't think that's reasonable
they have html content, why can't they annotate it like the big boys?
see above - it takes "big boys" with necessary experience/skills to actually produce a good data format
alternatively they can do what web designers have been doing before microformats even existed
just use semantic class names and posh in general http://microformats.org/wiki/posh
perhaps even create their own poshformats if they really want to try to start creating a "format" per se: http://microformats.org/wiki/poshformats
there are entire disciplines dedicated to this kind of thing, like Information Architecture
it is actually much more unreasonable to expect that small groups of people are going to be able to create something which actually takes some amount of expertise (in known and studied fields) to create
i think it's unreasonable to say that a small group of people can't use generic tools to annotate their markup
for that matter, i think it's unreasonable to require that every microformat has to have its own dedicated parser that knows about all the class names to get the data into a reusable and exposable data structure
even text/html's parser doesn't know about all the tags in html. :-)
and that's a pretty screwed up language :-)
it's the very act of annotation, in any meaningful sense (using a vocabulary etc) that's the hard part
it's not unreasonable to say that, just as it's not unreasonable to say that a small group of people can't use generic tools to build an airplane
but they can
not without being skilled
you can totally use generic CAD programs and CNC lathes and so forth
i've done it myself
and i'm hardly skilled
you might be able to build something that looks like an airplane, but it won't actually fly
sure, but they can still use the tools
similarly, groups may be able to build something that looks like decent annotations, but it won't actually work to share data
it doesn't need to work on a large scale, it only has to be good enough for their own needs
right, the airplane they build won't go short distances, across town even, never mind large scales like across continents
tell that to the FIRST robotics teams
it's not that much different than saying they can't write code just for their own needs too
every year, kids around the world build robots using generic tools that work very well
and getting basic annotation-like stuff working on a limited basis isn't anywhere near as complex as building a robot
it might be more complex/difficult actually
because with annotation, the tests are all abstract, data etc.
whereas with a robot, physics gives you good feedback at every step as to whether you're making a mistake or making progress
if you have a specific goal in mind, e.g. aggregating a bunch of documents, then testing is easy.
and that's the kind of thing we're talking about here.
specific concrete goals that need minor annotation support, and the ability to use generic tools to achieve the results
sure, and you might say RDF has been trying to solve that problem for 15+ years
RDF fails to solve that problem on so many levels it's not even worth discussing here
that's not the point
then what's the point? one group failed so we must give up?
the people who set out with the goal of creating generic tools for annotation have followed that path
not necessarily give up, but perhaps solve simpler problems first
learn from solving those simpler problems, perhaps with hardcoded vocabularies, parsers etc.
i believe the problem i'm describing is orders of magnitude simpler than the problems the RDF community set out to solve
and *maybe* we might have sufficient wisdom some day to actually produce a solution to the generic tools problem
Hixie, it doesn't sound like it
either you underestimate the kind of problems they want to solve, or you overestimate my ambition :-)
your scoping of annotation by groups and using "generic tools" matches the RDF community
so does my using english :-)
in the case of English, so does microformats http://microformats.org/wiki/en-US
my point is that just because there are similarities doesn't mean the same problem is being solved. That, if anything, is the mistake many RDF proponents make a lot.
sure, I tend to agree with that statement
however, if two things resemble each other, it's important to point out why it's different
otherwise you fall prey to the "quacks like a duck" problem
ok, things rdf attempts to solve that i have no interest in solving: a generic data model that can describe anything; preventing name clashes; defining schemas; ability to reason or perform inference based on the data; ability to translate from one vocabulary to another
and arbitrary annotation, small groups, generic tools are all very much "smells like RDF", or before RDF, smells like "knowledge representation" an already well trudged field of AI
it also smells of xml and json
and sgml
RDF itself has had (still has) difficulty distinguishing itself from KR
http://en.wikipedia.org/wiki/Knowledge_representation
"ability to reason or perform inference based on the data; ability to translate from one vocabulary to another" are more OWL than RDF
generic data model sounds like arbitrary annotations
(vice versa)
"defining schemas" is not that different from "defining vocabularies" especially if you start to add things like, what's the root, etc.
that just leaves "preventing name clashes" - so what you've described then is just RDF without namespaces
i disagree with most of that but the key one is that i don't think we should define what the root is in a definition of a vocabulary
i think it needs to be syntactically self-evident
interesting
we have found cases where it was useful to alter the definition of what was a root after the fact, and without problems
e.g. in hCard the root is the class name "vcard"
however we found use cases for separate/lone addresses and geo coordinates
and thus defined the "adr" and "geo" microformats as proper subsets of hCard with those two respective class names as roots
yeah i was looking at adr in particular recently
thus, I would offer that experience to date shows that it might not be desirable to require that the root is syntactically self-evident.
and wondering whether if we did add some way to indicate a "root" of some kind, it might make sense to have a way to indicate that a root is still part of another object as well
I believe we approached an isomorphic problem from a different direction, that is, it can be useful to be able to know that a contained microformat's properties should not affect the container - thus mfo http://microformats.org/wiki/mfo
and this has been based upon the experience of developing hAtom and hAudio
again - as before - reasoning from experience rather than just a priori needs
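[Editor's sketch] The two ideas in this exchange, roots identified by class name ("vcard", "adr", "geo") and nested roots whose properties should not leak into their container, can be illustrated as follows. This is not the hCard parsing algorithm or the mfo proposal: it assumes the parser is handed the set of root class names in advance (the very thing Hixie objects to), assumes well-nested markup, and records every non-root class name without checking it against any vocabulary. `RootScopedCollector` and its item layout are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: root class names are known to the parser up front.
ROOT_CLASSES = {"vcard", "adr", "geo"}
# Void elements never get an end tag, so they are kept off the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

class RootScopedCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.items = []    # one dict per root element, in document order
        self._stack = []   # per open element: the item it opened, or None

    def _current(self):
        """Return the item for the nearest enclosing root element, if any."""
        for item in reversed(self._stack):
            if item is not None:
                return item
        return None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        roots = [c for c in classes if c in ROOT_CLASSES]
        owner = self._current()
        opened = None
        if roots:
            # A nested root starts its own item but remembers its container.
            opened = {"root": roots[0],
                      "inside": owner["root"] if owner else None,
                      "classes": []}
            self.items.append(opened)
        elif owner is not None:
            # Class names are credited only to the nearest enclosing root.
            owner["classes"].extend(classes)
        if tag not in VOID_TAGS:
            self._stack.append(opened)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()
```

Fed `<div class="vcard"><span class="fn">Ann</span><div class="adr"><span class="locality">Paris</span></div></div>`, the collector yields a "vcard" item holding only "fn" and a separate "adr" item holding "locality", with the "adr" item marked as sitting inside "vcard".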
tantek: Blog and wiki themes both updated to use value-class-pattern, so will deploy them when we're ready.
tantek: Additionally, going to pop a new static header above the blog on the front page linking to more… regularly updated… sources of µf news.
Thinking: Twitter account, github presence, IRC channel… and anything that gets suggested to me before we go live.
certainly twitter and IRC
oh wait
there's already stuff in the header for most of those
github presence should probably just go on http://microformats.org/code-tools/
Who maintains code-tools?
admins
for IRC, perhaps we should raise that above the mailing lists on http://microformats.org/discuss/ and add some more description
Let me try again with what I'm talking about adding. The top nav is fine, though some of those pages need to be _actually_ maintained, not just theoretically by ‘admins’. I mean, where the first blog entry currently sits, have a little box with quick links for people to springboard to some more active, up-to-date microformats activity.
sure - or perhaps even where the "What are microformats" box sits
(or just below it)
for github in particular I was serious - in that rather than adding it to the home page, code-tools makes sense
Tantek edited to-do "/* Tantek */ remove items that have been completed, note current activity/prioritization on [[value-class-pattern]] and what it affects, add a task to contribute XFN+XMDP from gmpg to microformats.org" (+107) http://is.gd/vIYv
These logs were automatically created by mflogbot on chat.freenode.net using a modified version of the Java IRC LogBot.
See http://microformats.org/wiki/mflogbot for more information.