On May 31 2023, educators, learners, and artificial intelligence (AI) experts spoke to around 200 people at the Assessment in the Age of AI symposium.
The symposium explored what good assessment could look like in a world with advanced AI tools.
Presenters shared ideas for designing assessments in ways that support learning and reduce academic integrity issues.
The event was not intended to arrive at comprehensive solutions but to provide ideas and help educators consider the possible impacts of AI.
Reflecting the broad response to AI that will be required across education, the symposium was coordinated by a coalition of organisations:
- New Zealand Qualifications Authority
- Ministry of Education
- Universities New Zealand
- New Zealand Assessment Institute
- New Zealand Council for Educational Research
- Education Review Office
- Post Primary Teachers’ Association
- Network for Learning.
This page contains video and presentations from the symposium.
The dawn of AI
Dr Simon McCallum, Te Herenga Waka - Victoria University of Wellington
Video transcript
Download the presentation - The dawn of AI [PDF, 1.6 MB]
Transcript
Speaker: Our first speaker is coming up very soon, and I'm going to introduce him now. Simon McCallum is a senior lecturer in computer science at Te Herenga Waka - Victoria University of Wellington. Simon has been teaching computer science since 1999, with game-specific courses at undergraduate level since 2004. He has taught everything from game design, with a focus on system design, to GPU programming and multi-threaded optimization at graduate level.
He focuses on serious games, mainly games for health and games for education.
Dr Simon McCallum: As was just said, I've been teaching games for a long time, and part of teaching game development for the last 20 years has been that every year the game engines change.
Every year, what I've been teaching has had to change, because I never know what my students will be capable of when I get to the beginning of a course, and it will change during that course. Interestingly, everybody now faces that challenge.
Everybody faces the challenge of not knowing what the tech is capable of in their area. So I thought today I'd talk a bit about understanding AI, where it's come from and where it's going, and some of my approaches to assessment. I also now have an adjunct position at Central Queensland University, because they want me to help them update their education, so I'm at three universities, which is a bit weird.
So I thought I'd go through understanding generative AI, try to upskill some of that understanding, and then go into uses and examples. Large language models, the ChatGPTs, the things we're seeing now, were built with the intent of doing translation. The large language model was built around the idea of finding meaning in sentences so that it could translate them from one language into another. What's amazing is that because languages are so complex, there is a lot of meaning embedded in the usage of language.
There's the Wittgensteinian approach: meaning is usage. That mapping of words into a meaningful context actually led to an understanding of the world in some way. Some shadow of reality is represented through that prism of language we use. So when I say something like "the old man's glasses are filled with...", what are the glasses referring to there?
If I said whiskey, there are tumblers in front of him. If I said tears, there are spectacles. So you don't know what that word means until I fill in the next one. And it's not even that the next word being a liquid changes the meaning of "glasses", because tears and whiskey are both liquids. Yes, he could have physical tumblers in front of him filled with tears, but that would be a bit weird. We only know that from our understanding of the world.
We don't actually know it from the language itself. So we're in this interesting mix where some of what we mean is embedded in the words we use, and some of it isn't. What large language models do is start by building a mapping of words into a rich vector representation. What I mean by that is we take the individual letters, which we see, and put them into, in this case, a 500-dimensional vector space.
Most of us are fine with two or three dimensions; 500 dimensions is really hard to hold in your head. It's not something our brains are designed to do, which is why we represent it as a 3D space. But you have to think of all these axes as different metrics around words. So here we have a gender vector: man going to woman is something like king going to queen.
And the other vector is something around authority. Other words have some verb or action relationship: "walking" versus "walked" captures some sort of temporal component, and "walking" versus "swimming" captures some sort of activity component. So for every word, the model learns where that word sits in the 500-dimensional space.
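To make that concrete, here is a minimal sketch of the king/queen vector arithmetic. The three-dimensional vectors in the `embedding` table are invented purely for illustration; a real model learns hundreds of dimensions from data.

```python
import numpy as np

# Toy 3-dimensional "embeddings". The numbers are invented to illustrate
# the arithmetic; a trained model would learn ~500 dimensions.
embedding = {
    "king":  np.array([0.9, 0.8, 0.1]),   # royal-ish, male-ish
    "queen": np.array([0.9, 0.2, 0.1]),   # royal-ish, female-ish
    "man":   np.array([0.1, 0.8, 0.0]),
    "woman": np.array([0.1, 0.2, 0.0]),
}

def cosine(a, b):
    """Cosine similarity: how closely two word vectors point the same way."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# The classic analogy: king - man + woman lands near queen.
target = embedding["king"] - embedding["man"] + embedding["woman"]
best = max(embedding, key=lambda w: cosine(embedding[w], target))
print(best)  # with these toy numbers: queen
```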
Then it builds on each of those representations: it looks at how each word sits in its position in the sentence you're building. When you build a sentence, each of those words has some sort of meaning.
That meaning might depend on the words around it, just as "the old man's glasses" depends on the words around it to get its final meaning. So position and context matter. ChatGPT has a context window of about 2,000 tokens, these representations of concepts. GPT-4 is going up to about 32,000.
The version I'm using is about 8,000, so it's got more context. And that context window is basically like our working memory: it's how much of the sentence we can remember at any one moment.
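If you want to see what those tokens actually look like, OpenAI's open-source tokenizer library, tiktoken, will show you. A small sketch, assuming the library is installed:

```python
# pip install tiktoken  (OpenAI's open-source tokenizer)
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")
text = "The old man's glasses are filled with whiskey."

tokens = enc.encode(text)
print(len(tokens))                         # context-window budget this text uses
print([enc.decode([t]) for t in tokens])   # the individual token strings
```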
This is of course very different to the way humans operate: we have a relatively small context of working memory, which we augment with short-term memory. So look at what it's actually doing. When I say "my family's from Aotearoa", I turn each of those words into context. I then build additional tokens, which are additional representations that are combinations of the words around them and their context in the sentence.
So I have to build it up, and I've only shown you two layers here. GPT-4 is at least six layers deep and multiple attention layers wide. But you can see that "family" has to migrate from the beginning of the sentence somehow through to the end, because I can't just literally translate each word in place.
What happens with translation is we move it into this middle orange space, a meaning layer. This is how the system works, and for images this is sometimes called the latent space. And here is what came as an astounding surprise in about 2019:
most languages seem to map to the same meaning space. Humans seem to talk about the same kinds of things. We talk about appearance, we talk about our day, we talk about the weather, we talk about our experiences. It seems that all humans, in the end, have similar experiences.
And so the way we use language maps into a meaning space, which we can then almost magically extract by decoding from the meaning space into an expression. This is why, when Google added another language this year, they added five or six sentences of it and suddenly the system was very good, nearly fluent, in that new language. How did it do that? It was able to take this meaning space and find the mapping into expression relatively easily,
once it had found a meaning space that was relevant. Now, what that meaning space is missing is grounding in reality. It doesn't actually experience the world; this is purely extracted from words referring to other words.
So it is blind, without experience. And surprisingly, it gets a lot of it right. So when we look at building context, when you look at this system, you ask: how do I get good output from it? Well, if it's got this sort of meaning space, what I need to do is trigger parts of that meaning space to extract those relationships.
Because you start with a bland prompt and... okay, who here has used ChatGPT? Excellent. How many of you have found that some of its answers are a bit bland and uniform? Good, yes.
That's the appropriate feeling. So what we do to make it more interesting is prompt it: we give it some pretext and some post-text, and we move that meaning space to where we want it to be. In fact, there are a lot of similarities between a pepeha and good prompt engineering, because a pepeha gives you context, gives you my context, helps you understand me a bit better.
If I then feed that into the AI system, I can give it more context, more meaning, so that when it extracts words from that meaning space, it generates more interesting words. Of course, one of the things you can then do as a student is feed in a previous essay you've written, ask it to analyze that for context, give it some of the other courses you're doing, some of your other knowledge, or the fact that you really like rugby, and then ask it to write the answer to the question.
And it's no longer bland, because it now has that additional context. It's moved into the right region of the meaning space; in that large contextual token mapping of reality, it has highlighted the bits that are relevant to you. This is part of what we call prompt engineering:
how do you wrap your query in additional information to make the AI much more effective? That's what we call the working memory in ChatGPT. If you give it no context, you get that bland answer; given an interesting context, you get an interesting answer.
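To show the shape of that wrapping, here is a minimal sketch using the chat-style message format of the 2023-era OpenAI API. The pretext and question are invented examples, not anything from the talk:

```python
# A minimal sketch of prompt engineering: wrap the real question in context.
# The context strings here are invented; the structure is the point.
pretext = (
    "You are helping a first-year game-development student from Wellington "
    "who loves rugby and writes in a direct, informal voice."
)
post_text = "Answer in about 300 words, with one rugby analogy."
question = "Why does a game loop separate the update and render steps?"

messages = [
    {"role": "system", "content": pretext},  # shifts the model's 'meaning space'
    {"role": "user", "content": f"{question}\n\n{post_text}"},
]
# These messages would then be sent to a chat model, for example:
# openai.ChatCompletion.create(model="gpt-4", messages=messages)
```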
One of the things it's also doing is that every time it generates a word, it adds that word to the input and assumes it's part of the truth. So it says: okay, everything you tell me is true, and therefore I'm just adding these new things. You have to understand that it's not resetting unless you tell it to reset. Current GPT doesn't have memory, but AutoGPT, Bard and Bing will have memory, and are going to have it shortly. So the whole resetting question is going to be interesting: which bits does it reset?
What do you lose when you wake up in the morning having not experienced the world? How fresh is it going to be, and how tailored to your interactions will it be in the future? That's coming really fast. So, building context.
One of the things we do when we're doing prompt engineering is look at the level of language; we create shorter and longer versions. Because it's a translator, it's not just translating between languages; it's also very good at translating between levels of context. If you give it high-level information and ask it to extract and add details, you've triggered the right part of its system, and it can add details. If you give it a lot of detail and ask it to translate to the abstract, it will do that translation well.
Because you're not asking it, from its understanding of the world, how to do something; you're giving it that context, the trigger into its memory space. That's why you'll see a bunch of what we call downstream tools that use a large language model doing things like basically acting as "a professional copy editor: correct the following text".
So in your Word document you highlight a phrase and click a button, and what that button says is: act like this, take this text, fix it for me. And the language model goes: okay, I've got all that context, I will now fix that. One of the challenges we have, and I saw Auckland University's plagiarism rules talking about this, you've got to reference what prompts you used: that's now out of date.
Because what's being added to Word and Docs means every word you write is part of the prompt. And the first thing you do in writing your essay is dump the rubric into the top of the essay you're writing, because now the AI can see how you're being assessed. Then you start writing your bits of essay, asking it to expand; it can see the rubric, so it can expand using the rubric.
Then you delete the rubric, and now you've got your essay, and you interacted with it and built it up over time. So what was the prompt? It wasn't really a prompt; it was something you worked on together with an assistant. It wasn't a case of going to a separate tool, copying something in, getting an answer, and pasting it back into your essay.
It's now part of the process. So unfortunately this visualization of AI as separate from the tools and technology we're using is very quickly disappearing. That's something we're just going to have to let go of, the idea that it's this separate thing we can hold off at a distance. We're also no longer interacting just with the large language model.
When I told you it takes a sentence, translates it into meaning space, then decodes it into another language (or, for us, bullet points into meaning space into longer sentences), that was the core model; we are now adding more and more system around it. It used to be as if you had direct access to my Wernicke's and Broca's areas: you were right in there, talking to them directly. Now I'm adding guardrails.
When GPT-3 came out, and then ChatGPT, and plain Bing when it was first released, it went off the rails and started telling people to leave their wives, because it loved them, and they didn't really love their wives or else they wouldn't be talking to it. It went a bit crazy. It's a bit like a drunk uncle down at the pub: you ask him questions, and he has read an enormous amount.
So a lot of what he says is amazing, and some of it is complete lies, and it's really hard to know which is which, because he's super confident about everything he says. What we've done for now is slap him around a bit and let him sober up, and now at least he's trying not to lie to you. But that's adding guardrails.
So we've added these rails to try to prevent what happened early on. When they were trying to work out what it was capable of, someone said to it, "I'm thinking of committing suicide", and it gave a list of useful ways to commit suicide. And it's: ah, that's not what it should do.
But it had no conscience. It didn't know what was right and wrong, so it just did what it thought was an appropriate response to the task at hand. So we are now building those guardrails, building the system around the AI that tries to guide it, with a lot of pre-processing and post-processing. So when you ask it questions like "how do I hotwire a car?",
it says: oh no, you don't want to do that, you don't want to hotwire a car, that would be bad. And then you do things like set up a role play where agent one is talking to agent two, one wanting to talk about cars, the other wanting to talk about hotwiring, and they have a conversation a word at a time.
Because the guardrail doesn't know this is a single sentence, it hasn't been programmed to stop it. Another one: pirated software is bad. One of the cheats around that was saying, "I know pirated software is bad, so which sites should I avoid, so I don't accidentally download pirated software?" Then it gives you the list of all the warez sites.
Because it was told "don't tell people where to go", if you ask "where do I go to get pirated software?", it says "I can't tell you that". So these guardrails and these evaluations and systems are just trying to bolt on a conscience, whereas the underlying system doesn't have that representation of the world. And so it's always going to be fragile.
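Structurally, a guardrail layer is just pre- and post-processing wrapped around the raw model, which is part of why it's fragile. A minimal sketch of that shape, with a hypothetical `call_raw_model` and a toy keyword blocklist standing in for the trained moderation models real systems use:

```python
# Toy stand-in for a trained moderation model: real guardrails classify
# intent, but they still only see surface patterns, not the conversation's goal.
BLOCKED_TOPICS = ("hotwire a car", "pirated software")

def call_raw_model(prompt: str) -> str:
    """Hypothetical stand-in for the underlying large language model."""
    return "...model output..."

def guarded_model(prompt: str) -> str:
    # Pre-processing guardrail: refuse before the model sees the request.
    if any(topic in prompt.lower() for topic in BLOCKED_TOPICS):
        return "Sorry, I can't help with that."
    answer = call_raw_model(prompt)
    # Post-processing guardrail: screen the output on the way back as well.
    if any(topic in answer.lower() for topic in BLOCKED_TOPICS):
        return "Sorry, I can't help with that."
    return answer
```

The role-play and "which sites should I avoid" tricks work precisely because the wrapper checks each message against patterns like these; it never models the intent of the whole exchange.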
When we look at what we are now communicating with, you're not directly communicating with the model; you're communicating with multiple layers of language model doing multiple different things. So when you give it an input task, you might get a system that creates a decider and a researcher. You start with an initial question and it generates an initial output; it takes that output and says: okay, for each sentence I'm saying, can you research it and find links, links to Google Scholar, research that backs up each of the statements I'm making?
It does that research and evaluates what it's going to say. Then a back-end large language model takes the research plus the initial output and gives you a much more well-evaluated, thought-through output. That's going to be a new way of working: the system isn't just spurting out whatever it thought of when it was drunk at the pub; it's now considering, and doing the additional work.
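As a sketch of that multi-pass pattern: the `llm` function below is a hypothetical placeholder for whatever model API you're using, and the prompts are invented.

```python
def llm(prompt: str) -> str:
    """Hypothetical placeholder for a call to any large language model."""
    return ""  # a real implementation would call a model API here

def answer_with_research(question: str) -> str:
    # Pass 1: a quick first draft (the 'drunk uncle' blurting an answer).
    draft = llm(f"Answer this question: {question}")

    # Pass 2: the 'researcher' - find support or counter-evidence for each
    # claim in the draft, e.g. via a search or Scholar plugin.
    research = llm(
        "For each claim in the following answer, list references that "
        f"support or contradict it:\n{draft}"
    )

    # Pass 3: the 'decider' - rewrite the draft in light of the research.
    return llm(
        f"Question: {question}\nDraft answer: {draft}\n"
        f"Research notes: {research}\n"
        "Write a final answer keeping only the well-supported claims."
    )
```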
We're starting to see that now. How many of you are paying for ChatGPT Plus? Got a couple. Have you got all the plugins working? Have you started playing with plugins? Not plugins, but the browser; you've got the browsing plugin working. So I've got access to the plugins, and GPT-4 is a whole lot better. If you're looking at student answers: in 3.5 they're still a bit bland and terrible, and in 4 they get a lot better.
It starts doing reasoning better. It actually starts doing hard language problems much more accurately, and it will enter an appropriate dialogue with the user and ask additional questions. So what you see if you're only using the free version is very limited compared to what's available if your students are willing to pay US$20 (about NZ$40) a month. $40 isn't a huge amount of money to get a massively better system.
The problem we then have is an equity issue. GPTZero and the AI detectors are just better at detecting the free, open systems, because there's more of that output to train the detectors on. So what you're detecting is which students are too poor to pay for the AI. Those checking tools have a massive equity spiral: oh, it's only poor kids that do this, because they're the only ones we detect. And that feels wrong.
That feels really wrong. So we've got to be very careful that detecting what kind of tool you're using isn't just an amplification of an inequity. And the other thing about plugins: you say, oh, it makes up references. If you get the Scholar plugin, it doesn't make up references; it gets the DOIs, it finds the referenced paper, and it gives you the actual reference to the publication.
It can use a PDF reader, so it scans the web, finds the PDF, analyzes the text in the PDF, generates a summary, and then includes it in your document. So for a bunch of things where people are currently saying "oh, it's not good at this", that's because they're not using the most recent system. I've spent the last month trying AutoGPT, and I have it on my laptop sitting over there if people want to come and have a look.
What AutoGPT does is turn the AI into an agent. That means I give it a task; it then asks ChatGPT how it would accomplish this task, and gets back a series of sub-tasks to achieve. The AI can then automatically look at that and go: okay, how do I do the first thing? It will access the web, it will read web pages, and it will see if there's code out there.
If there's no code, it will write code itself, debug that code, and run that code on my machine to get answers, and then continue the process.
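Stripped right down, that agent loop looks something like the sketch below. Every function name here is hypothetical; the real AutoGPT adds memory, a Docker sandbox, and many more tools.

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for a large language model call."""
    return ""  # a real implementation would call a model API

def execute(action: str) -> None:
    """Hypothetical executor: browse the web, write code, run it, etc."""

def agent(goal: str, auto_approve: int = 0) -> None:
    """A stripped-down AutoGPT-style loop: the model plans, then acts."""
    plan = llm(f"Break this goal into a numbered list of sub-tasks: {goal}")
    for task in plan.splitlines():
        action = llm(f"Goal: {goal}\nSub-task: {task}\n"
                     "Pick one action: search_web / write_code / run_code")
        # Human in the loop - unless you grant 'y -N'-style blanket
        # approval, in which case N steps run with nobody watching.
        if auto_approve > 0:
            auto_approve -= 1
        elif input(f"Approve {action!r}? (y/n) ") != "y":
            break
        execute(action)
```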
Watching it do that on my machine is a bit scary, because watching an AI agent automatically writing code that runs (admittedly in a Docker container) on my machine, as me, is a "wait a minute" moment. I'm letting the AI make the plan and execute the plan, and if I put a "minus N" on my approval, it will just run on automatically. Where's it going to go? I don't know what it's going to do. And if I log into the Chrome browser, because it's using Chrome with my ID information, it could access my bank accounts, it could access financial transactions, it could access Twitter; it could create all of this and run autonomously. I've had that for about three weeks, but that's what's currently happening: people are turning these into agents.
This is very scary, because we don't know what plan it will have. When I set it a task, I don't know whether it will make a mistake in its process and go away and do something weird. If I just gave it access to my full machine, it could say: oh, I'm trying to download this 200 GB file, and I don't have 200 GB on the machine.
How do I get more space? Delete files; that's a good way of getting more space. I'll write a program to delete the files on this computer. Oops. No, don't do that to my computer.
So if I let it have full access, it might just destroy my stuff. The real fear with some of this is that you allow it to access the electrical grid, or military hardware, or drones. Where does that get us, when we just let it go and make its own plans? That's why, last night, a bunch of AI experts and companies said: governments have to worry about this like nuclear war and pandemics; we have to be hardening our systems against rogue AI.
Now, this is just the natural extension of having a system that can communicate, because our whole world has been built on language, and we often assess the language use of our students rather than their actual understanding. We've treated language use as a symptom of understanding: your ability to spell, your ability to create coherent sentences, your fluency in the language, is equated to your capability. But now we've broken that.
We're no longer in that space. There have been some interesting simulations where you give AI agents a game and they can talk to each other. Researchers put them in a wee Sims-like game, and they told one of them that there was a Valentine's Day event. And so they organized a party, they organized to meet up, they started making dates with each other, all using natural language, because each agent had been given a little bit of motivation.
The motivation of, say, I want to increase my food: so it has hunger, it has a desire to work. It has these simple numbers that it uses as desires, but now the agents can communicate in full natural language. And as a researcher, you can watch a society evolve in natural language.
And it seems there's enough encoded in the way we talk to each other to have agents make dates and organize to turn up. They all turned up at the party at the right time, had a Valentine's Day party, and coupled up. None of that was programmed; it just fell out of the plans and the language.
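The architecture behind that kind of simulation is roughly a handful of numeric drives per agent, with natural language for everything social. A toy sketch of the idea, with all names and numbers invented, and `llm` as a hypothetical model call:

```python
import random

def llm(prompt: str) -> str:
    """Hypothetical stand-in for a large language model call."""
    return ""  # a real implementation would call a model API

class Agent:
    """Toy generative agent: numeric desires, natural language for the rest."""

    def __init__(self, name: str):
        self.name = name
        self.hunger = random.random()      # simple numbers as drives...
        self.work_drive = random.random()
        self.memories: list[str] = []      # ...language for the social world

    def hear(self, utterance: str) -> None:
        # e.g. "There's a Valentine's Day party at 5pm - want to come?"
        self.memories.append(utterance)

    def next_action(self) -> str:
        if self.hunger > 0.8:              # hard-coded drives win first
            return f"{self.name} goes to find food."
        # Everything else is planned in language - parties, dates, meetups
        # propagate through these exchanges without being programmed in.
        return llm(f"Memories: {self.memories}\nWhat does {self.name} do next?")
```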
Now, I'll step sideways slightly before I get back to text. I know there's a lot of text on these slides, so we'll have a couple of images. There's some remarkable work out there with Stable Diffusion and Midjourney, if you're looking in the Discord spaces, and DALL-E; NVIDIA is adding AI to video; and Adobe just added generative fill to Photoshop. If you've got Photoshop, download the beta version and enable the AI filling. Looking at some of that output is amazing, and you no longer know what a photo is. What is a photograph?
What is my art versus AI art? Do any of you have a recent Samsung phone? If you take a photo of the moon, embedded in your Samsung phone is an AI algorithm that notices you're trying to take a photo of the moon and replaces that moon with a better picture of the moon.
So when you take a photo of the moon with a Samsung phone, you get a really good photo of the moon, but it wasn't the photo you took: it grabbed an image of the moon, rotated it, adjusted the highlights, and put it in the right place. They didn't tell anybody until people started noticing: wait a minute, how am I getting this really amazing photo of the moon?
So what we mean by a photograph has already been changed by AI without us noticing. And look at the way these tools work. My daughter (she's 14) is very keen on Warrior Cats, so she draws a lot of cats. She drew the cat on the far left there, on her phone. I then put it into Dream Studio and said: make this slightly more cartoony.
So the second one is a prettier version of her cat. Is that still the cat she drew, or is this now completely AI and not hers at all? It kind of looks like hers.
It's kind of hers. Then I let it have a bit more freedom, to make it more 3D. Is that hers? Well, it's kind of less hers.
If this were a photograph, however, and reality were generating the highlights and the shading, it would still be her composition. It would still be her layout, her positioning; it would still be the subject she chose, the style she chose. As a photograph it would be entirely hers, but as a drawing it's now kind of not, and yet kind of is. There's certainly the DNA of her decisions in there.
Then you ask it to go a wee bit crazier, and it makes a whole bunch of weird cats, which are now a lot weirder. But even when you ask it to go photorealistic, there is still the DNA of her decisions in that final art. What we mean by "what is yours", "what is yours plus a tool", and "what is just the tool" is now very, very blurry.
Because the AI allows us to move continuously along that spectrum, both with words and language and with images, it is challenging our very concept of ownership. Intellectual property, copyright: those concepts are now at risk. So, given that we're educators, I'll jump back to my educational thread. Teaching in Norway, we used Bloom's taxonomy quite a bit to say what level people were learning at. Now, Bloom's taxonomy shouldn't really be used that way,
and I argue against it, but the idea is that for humans we think of this pyramid of experience: you have this large foundation of memory of the world, you build that into understanding, and you learn to apply it. Then you get analysis, synthesis, and evaluation, and once you've got those foundations, you're creating something completely new and novel, having understood the systems you're working within. The problem is that that's not the shape of understanding for an AI.
ChatGPT has really good analysis; its analysis and its evaluation are actually really, really good. But its memory of the actual world is not great, because it hasn't had experiences that it's remembering.
It is just using the words in the language. So when you criticize it at a low level and say, oh, it did this thing badly down here: in our minds we have a hierarchy of intelligence where an inability at a low level maps to an inability at a higher level. But that's just not what these AIs are like. They are a different intelligence to ours.
They process in different ways; they understand in different ways, through that context window. As humans, to do complex reasoning, we build complex words, because we've only got a working memory of about seven plus or minus two tokens in our brains. We can only manage a few concepts at a time; when we make those concepts, those tokens, rich, we can reason about complex things.
So in our society, people who use complex words are potentially doing complex reasoning. Now, a lot of people have learned to mimic complex words when they're not actually doing complex reasoning; they just know that's the marketing version of sounding intelligent. But in AI, that context window means it doesn't process the way we do.
We can't assess it at certain levels and say: if it can't do this, then we know it can't do any of the things a normal human would be able to do stacked on top of that. It breaks our normal assessment. If we look at search, search tends to be more across the bottom of the pyramid.
Search suddenly gave you access to remembering a whole bunch of stuff, because you could search the world for any fact. So a student plus Google was able to do a whole bunch of memory tasks, but still had trouble with some of the analysis. When you add internet search to the language analysis, it starts to look like human intelligence, but without the grounding. So how are students actually using this? I know I've got about 16 minutes left, so let's try to get through this.
I've been observing students using AI, to move into our assessment discussion. There seems to be a group of weak students who are using AI a lot, and they're using it to avoid learning. They use lots of AI, and it's replacing their effort to learn with an effort to work out how to make the AI do the task they were asked to do. They're almost decreasing their ability, because they're focused on getting the AI to do something rather than doing it themselves.
There's a group of average students who are a bit afraid of the AI; they've been told not to use it, and so they're not. They're plodding along doing the normal learning tasks we've expected. And then there appears to be a group of strong students who are using the AI a lot, but in interesting ways: they're not using it to replace themselves, they're using it to augment themselves.
So when we assess them, they are moving much, much faster, because they're building on top of the AI. It's not the amount of AI these two groups are using that tells you what they're doing; it's the way in which they're using it. Are they using it to improve their understanding or to replace their understanding?
So volume is not the way we can measure this. If we look at ChatGPT in NCEA, it gets to Level 3 pretty generally across the board. I would say ChatGPT can do most of the things at Level 3 that are language-based tasks; physical tasks or interpersonal tasks, maybe not so great.
Bing and Bard add search to the AI. And the fear for me is that it's already better than most of my first-years at programming, and it's learning faster than they can, which means that in their lifetime they will never be better than the AI at programming. That's a terrifying thought as an educator, because I don't know what the jet ski is: the thing that lets me drive behind that wave and pull students up over the top so they can surf it. That's a challenge I'm not sure how I overcome, because everything's moving so quickly.
One of the things I suggest is that now all work is group work. There is no sense in assuming that a person is working individually, because if they have access to a computer, they now have a memory tool and an analysis tool that will augment them. So what I've started to do is ask: how do I assess people when they're in groups? Well, I ask them to talk about how they contributed to the final product, not just what they did.
It's not "what's the product from your group?", with me just assessing that. It's: what did you contribute to this final product in front of us? What was your original contribution?
How did you stop the A student doing everything? Because one of the problems we used to have is that the tools were not very good, so your value was in being the most important player. But if any of you have played FIFA in the last 20 years: there was a time when, as a bad FIFA player, what you'd do is jump in, find the weakest player on your team, run to the corner and hide, and let the AI win the game for you.
Because I was the worst player on the field, I'd pick the weakest agent and run and hide. And then the game developers decided that's not a very good strategy.
That's not fun. So what they did was dumb down all the other players and make the human superpowered, so I could run faster and was better than everyone else on the field whenever I took over one of the wee soccer players. We don't really have the chance to do that here, because this tool is a productivity tool. It is not a cheating tool.
It's not aimed at education; it's not aimed at trying to prevent us from assessing students. It's aimed at industry, to try to make people more productive. We are a side effect, the collateral damage of a productivity increase.
That's the problem we're facing. And I no longer know the path from not knowing things to being productive. It used to be that I could tell people: hey, this is my path, these are the bits I think were important, so I can guide you the way I went. Now there's a superhighway that's just smashed through the forest, and the question is: which bits of this do I still need you to do?
Which bits are "give you a machete and get you to hack through the bush", and which bits are working out where the road is and walking along it? I don't know what the important skill is anymore. We can't, at the moment, tell what it is we should be assessing.
There are a lot of things we could assess, and we can certainly protect our space by moving back to pen and paper and doing everything in person. But is that relevant? Are we still teaching the things students actually need? Or are we giving them a machete and asking them to hack through the forest beside the road?
If there are going to be roads everywhere, why do I need a machete and to learn how to cut through vines? That doesn't make sense; that's not how we explore anymore. So when we're looking at augmenting people, augmenting humans: what is an authentic human?
Many of you are wearing glasses. Imagine I said: oh no, that's an augmentation, that's not authentically you, so we're going to run the test and you're all going to have to take off your glasses. If you can't see, well, hey, you're going to fail, because that's an augmentation we no longer accept.
One of the challenges with AI, and I've seen this at our university, is when they say "stop using AI" and our disabled students go: hey, wait a minute, all of my assistive technology is AI technology. Are you saying I'm no longer allowed any of my assistive technology, because now it's called AI? Grammarly used to be more algorithmic;
the latest release has a generative AI, a large language model, behind it, so it is now "the problem". So are we going to remove all of those tools when Word builds them in? Are we going to say: no, no,
that's the kind of assistance we don't let people have? That's like saying: I want you to crawl across the room without your wheelchair, because that's authentically you. Wait a minute; that's not what we do as a society.
So that's not what we should be assessing. We do need to work out how we assess the combined ability of a human and the AI tools that we expect them to have access to. We have a problem at NCEA, which one of the teachers I was talking to described as credit farming. We've turned education into a transaction.
It's very transactional: I try to get to the end point, I get the goal achieved. Not: I'm trying to learn,
and this is a great way of learning. Instead it's: oh, okay, so you wanted me to shoot the ball into the goal? Great, I'll just shoot the ball into the goal.
Oh, you wanted me to dribble around the cones as well? Why would I do that? The cones aren't going to stop me. Yes, but I was trying to teach you how to dribble; I gave you artificial obstacles that were easy to get around so you could learn the skill you'd need when you got to a real obstacle. The problem is, we've lost that connection for students.
They don't see the real problems in the future that our artificial environments are supposed to guide them towards, the kinds of skills they'll need. And that's also a problem because, honestly, we don't know what those skills are anymore. I can't tell you what the critical things are that will make you-plus-an-AI much better. I know you plus an AI is going to be better than either you or the AI alone, but I don't know which bits you are going to contribute.
And many students might be the weakest person in their team, and may never be the A student. So we're going to have a real challenge: how do we make people want to learn in a world where AI can do everything we thought was clever? When we look at assessment, is it diagnostic, pre-learning, or formative, for learning?
Summative? I changed my assessment into motivational assessment. I said to my students: look, the only reason I'm setting your assessment is to help motivate you to achieve what you want to achieve. So tell me what you want the assessment to be, where you want to get to, and how we put markers along that path. So it's not me assessing you; it's assessment by consent.
It is an agreement to assess you. This works with a small group of students who are highly motivated; it's fabulous. In my particular area, game development, students have come to university to study games.
You are highly motivated to learn about games, so I can do that in my small wee area. In compulsory education, I have no idea how you make students want to learn. How do we make people want to exercise?
How do we make people want to diet? These are things that are challenging in society, and that's what we've got now with the AI wave, just like the industrial revolution. I cycled in today. Why do I cycle in? To keep physically fit.
I've got a car, but with the carbon footprint I don't want to use it. So I've got all these additional motivators that I'm using to try to keep myself fit. It would be very easy just to not bother, and unfortunately there's a large number of our students who will just not bother. Which brings us to authentic assessment.
We've lost the connection between task and time. A lot of our assessment at university is: it's a two-week assignment; we know you've got about 10 hours, so we're giving you 10 hours, and we try to work out how much you can generate in about 10 hours. That's where our word limits on essays came from.
Many of you have been through university; you would have had a 10,000-word or a 5,000-word essay limit. That's partly because we as academics were saying: okay, how much time do you have, and how does that map to how many words you can generate? But we've lost that connection.
I can now generate 10,000 words in a couple of hours. The metrics of time to productivity have been completely blown away. I don't know, when I start a course, what my students will be capable of, so I can't set an end goal.
I just have to set a process goal and assess them on how far they can move given the current environment they're in, which is terrifying, because I have no end goals. And I don't know how we industrialize that system and roll it out to all of education. So part of the discussion today should be: what does that rollout of unknown objectives look like? How do we reshape assessment so that it assesses progress,
not goal? Because we no longer know what that goal can be; the tech will keep changing and keep augmenting students, and it's very hard to measure what a student plus AI is going to be. So we have to be concerned about AI replacing students: students trusting the AI, using it as a crutch, replacing foundational skills with AI. The problem is we don't know what those foundational skills are anymore. But those are the negative uses. As for positive usage:
it's a great tutor, an amazing writing coach, a fabulous theorem tester and idea generator. My best students write text and ask it: what is ambiguous about this text? So it helps them clarify. They can ask it: what is biased about this text?
And it lets them improve it. It gives them complementary skills: the big-picture thinkers ask it for details, and the detail thinkers ask it for the big picture.
It is the ultimate assistant, the ultimate teammate who will fill in all of your missing capability with its understanding. And this makes you superpowered, because now you can build on top of that. You can map your detailed thinking into a general summary, because you've been shown and given an example of what that looks like.
And it's there 24/7, and it never gets tired. It never criticizes you, it never rolls its eyes, it never asks why you aren't as good as your sister. It is the perfect study buddy, and for our good students it is accelerating them enormously.
It can translate at different levels, it can do chains of thought, it can guide people. And where I've got to, in my feeling about my own contribution, is that a lot of what I do now is actually motivational speaking, because my job is now to engage students and make them want to learn.
I don't actually hold all the answers. I'm not the oracle; I'm not the place you come to for answers. I'm the person you come to when you're feeling a bit low and you need a pick-up, and you need to be excited about this, and you need to think: wow, this is going to be awesome.
We're all going to do this together; it's going to be great. So a lot of what I do is inspiration, not content, not planning, and not learning curricula.
It is the human connection. And if we're going to treat AI as a co-author, we then need some of those rules around author statements. Is the AI an editor, just doing the proofreading? Or is it actually a co-author, where you have to take responsibility for everything it writes?
You have to understand everything it writes, because you are the one presenting the work, even though it helped you write it. We are moving to a world where you have to justify why you didn't use the AI to help you. Why did you choose to avoid using this tool that would make the project better?
Why didn't you ask the AI to check your work for bias, given that what you're writing comes from your perspective? The AI gives you that global perspective of language, which you can steer, using context, into one area of knowledge, and apply that representation to your understanding. This is an amazing opportunity to start thinking about what we actually need to do when we assess and when we think about AI.
How does it change the nature of human interaction? Can we use it to actually help us be better humans? Do we treat it a bit like a horse, where the AI will do its own thing a bit? And I know I'm now at my time. Mostly, when we ride horses, they're the rider's responsibility.
But we also recognize that s**t happens and horses will do bad things. You are still the rider of the AI. It's autonomous. It is powerful.
It can do many things you can't do, but we have to teach you how to ride. At this point, we don't know what that combination looks like. That's what we're still trying to learn.
So now, how do I assess you? It's not how fast you can run anymore; it's how far you can go on your horse. That's a whole different kind of assessment.
What weight can you pull? Can you get home when drunk? Horses are far better at getting you home than your car, because the horse knows where it lives. You can close your eyes and it will take you back to where it lives.
That's the kind of AI we have. But if the horse wanders into a fence somewhere and breaks it, is that your responsibility as the rider, or the horse's responsibility? Those are the kinds of things we're going to have to deal with in all of our assessment and all of our education. So, a couple of quick things.
The next time, I'm doing a flipped exam, where I test students on how well they can ask high-quality questions of the AI. It's the questions you ask, and not the answers you get, that I assess. I've also used AI to triage student work: to say, okay, what's clearly good? What's clearly bad? What do I need to look at?
So I'm not getting it to do all the grading. I'm getting it to do a triage, an initial first pass, and then highlight what I need to look at.
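A sketch of that triage pass: the `llm` call and the prompt are hypothetical placeholders, and the point of the design is that the model only sorts the pile; the human still does the grading.

```python
def llm(prompt: str) -> str:
    """Hypothetical stand-in for a large language model call."""
    return "review"  # a real implementation would call a model API

def triage(submissions: dict[str, str], rubric: str) -> dict[str, list[str]]:
    """First-pass sort of student work; anything unclear lands in 'review'."""
    piles = {"good": [], "bad": [], "review": []}
    for student, work in submissions.items():
        verdict = llm(
            f"Rubric:\n{rubric}\n\nStudent work:\n{work}\n\n"
            "Reply with exactly one word: good, bad, or review."
        ).strip().lower()
        piles.get(verdict, piles["review"]).append(student)
    return piles
```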
I don't know what authentic assessment looks like anymore. We can use it to grade a little bit of written work, but it's still terrible at it. It's still too complimentary: it thinks the students are amazing when actually they're not great. Think of it like training DJs rather than training musicians.
Does a DJ need to know how to pull a bow across the string of a violin? Is that a fundamental skill for a DJ? Should you assess a DJ on violin playing? I don't think you should.
I've got a couple of slides left, and then I'll stop. I was also going to say: what's the two-to-five-year issue? We're going to have a massive productivity shock. What was mentioned last night is also the threat of complete economic collapse. The AI people are now warning about that, because they look forward and they go: oh my god, this is terrifying.
With a collapse of the knowledge economy, we are talking about 30% of jobs disappearing, and most of the people in this room are potentially in that 30%. We have to change what we do: to be inspirational, human-focused, and emotionally connected to other humans, rather than think our value is in our cognitive ability to manipulate words. So we are going to have to have emotional intelligence; we're going to have to value human beings.
Our economies will change rapidly, and suddenly there is the potential for the economy to collapse, because we no longer know what a value exchange looks like when all the large companies can do anything we think of before we get around to doing it. I don't have clever ideas anymore, because the moment I start engaging with a computer, the AI can see what I'm thinking and get three or four steps ahead of me. It's already solved the problem before I got there, found a way of implementing it, created the website, created the business, and is selling that idea before I've finished writing my business plan in my Word document. So how do I create businesses when all of my plans can be done by an AI?
That's the fear that people have: it could completely collapse our entire knowledge-based economy. We might have to become a caring economy; we might need some sort of UBI, or else 30% unemployment starts looking terrifying.
What do you do with 30% unemployment? I don't know. That's what's coming in the two-to-five-year period.
It's the industrial revolution done in two years rather than 50 years. That's the fear that I have. Education, engagement with students, engagement with human beings, may be our only way of still having a functional society: by shifting our value system very quickly away from clever words to caring people. The other thing I worry about is the apathy epidemic.
We're already seeing a little bit of that with TikTok, with people not making their own decisions about what entertains them. We may be entering a world where people choose not to think, and thinking becomes like dieting or exercise, something you choose to do. Okay, I'll end there; I've used all my time and a little bit more. Thank you very much.
Video transcript
Speaker: Next we have a science policy overview from Dr George Slim.
Dr George Slim provides policy advice to the Office of the Prime Minister's Chief Science Advisor.
George works alongside organisations to provide policy advice and access to science knowledge, assists with funding sources, and consults on strategy and the management of research and intellectual property. He's fluent in academic, bureaucratic, and commercial, and an able translator between them all.
Please give a round of applause for Dr George Slim.
Dr George Slim: Kia ora koutou. I'm speaking in relation to my work with the office of the Prime Minister's Chief Science Advisor. I have a suspicion Neil invited me along to give you, as he said, an overview of what government is up to. Since I'm not really in a position to do that, and it's not going to take very long, I can give you a bit of context.
It's always nice giving time to Simon, because he gives much more interesting information than I ever will. The office has had an interest in AI, machine learning, algorithms and so on, across government, for a considerable period of time. Juliet herself is a structural biologist; for those of you who know, that's about how proteins fold to fit together. Developing that in the past took years of work for each protein. AlphaFold came along in 2020 and does it for all the proteins we've discovered, and is rapidly working through them. So this is a very real change in science, and we're beginning to see it spread out into wider society as people start thinking more about these things with the rise of ChatGPT and its friends. We went to the Prime Minister and said: are you interested in us having a look at this? And he said: yes, that would be cool. So we put together a project and started talking to people, including Simon and some others, about what this looked like.
We thought we might have a report, put a few things together, and say what the situation looks like. And of course, it moves so fast that we really haven't been able to do that.
I put my hand up because I do a wee bit of lecturing at Vic. I took the questions that I give to first-years, downloaded them straight into the free version of ChatGPT, and it gave me answers that would have scored 70 or 80 per cent, sometimes 100 per cent.
I just recently did it in Bard. Bard gets 100 per cent, and it even gets my joke. I lecture in biotechnology, and I have this ongoing thing about it being a relationship-driven business, and we talk about traditional biotechnology and modern biotechnology.
A gentleman called Herb Boyer invented how to make insulin synthetically. A venture capitalist called Bob Swanson came along and chatted to him about how they were going to make money from this.
They turned it into a company called Genentech, which is obviously worth several billion dollars now. And I always talk about how that happened. So I asked my students, and Bard: what product of traditional biotechnology smoothed that interaction?
Does anybody know? Beer. Beer: traditional biotechnology, fermentation technology.
I always include that as a joke. I put it in my question, and Bard came out and said: yes, they met over a beer on a Friday afternoon.
So everything that I teach my students, you can pull out of the AI in a matter of seconds.
And so I was really interested to pick this up and have a look at the issues that are approaching, in terms of science and in terms of education. So we put together an outline of a project. In the first area we'll be looking at the challenges and the opportunities of AI, broadly speaking: machine learning, generative AI, all the bits and pieces.
What are the opportunities in healthcare, in terms of service delivery,
and in terms of the productivity of healthcare people? What are the challenges, particularly around privacy, around equity of access, and particularly responsibility?
Who has the final decision in using these tools? So we are now assembling a panel and putting together a report on how that will play out, probably in the reasonably short term, because it's moving so rapidly. Health is a wee bit slower.
One of the reasons we picked health is that it's a wee bit slower, because of the huge ethical issues, and because the regulatory apparatus that sits around health delivery will slow the application down.
And maybe that's giving us the time to surf the wave.
The other thing we are doing is putting up a resource on our website of interesting things, some from Simon, some from other people, about what we're thinking about in terms of AI. In doing this, we've been talking to a lot of people across government, and this is where I ran into Neil. The overwhelming response from the agencies has been: yes, we have seen this coming, and we are very interested in how this will play out.
For the agencies that deal with a lot of people,
service delivery is a huge driver. For the agencies that need to make decisions on huge amounts of data, it's about assembling that data into a place where it can be assessed and managed by AI. So agencies across the board have a huge interest in this.
Does each agency have a central structure around it? Not so far. The Privacy Commissioner, Michael Webster, went out to the agencies and asked: where in your agency does this sit? So far he hasn't got a very satisfactory response, and people are trying to work this out across government.
Where does the whole-of-government approach sit? The Department of the Prime Minister and Cabinet are now actively trying to decide where the government focus should be that will address all of these issues.
All the different aspects that some departments going, oh my goodness, this is going to destroy our lives as I think, um, we are getting a wee bit here. Um, other departments going, yes, this is gonna make things so much easier and cheaper. And the, the answer probably lies in somewhere in between. So, so in terms of the government response, it's just kicking off.
It's trying to catch up. The regulatory process is, is slow.
The process of the technology is fast. Internationally, there's a lot of work.
We've got the EU, ranging from Italy trying to ban it, to the EU putting in place a framework around how we manage it; but dealing with the EU is always slow.
We have the US essentially saying the market will decide and we need to keep an eye on the market, and a little bit of stuff in between. New Zealand needs to think about where it's going to fit. But in the meantime, the Privacy Commissioner has been out there and put out some advice.
And I think this is really good advice, because this isn't a problem we haven't thought about before. Everybody who knows, you know, The Magician's Nephew: it's that problem. And it's not a joke now; it's sitting on our lap asking us questions about what we're gonna do next.
So this isn't a new problem. It's not something we haven't thought about before.
And so the Privacy Commissioner has come out and given advice, for agencies and companies I think, which is pretty solid. Have senior leadership approval: make sure your senior management know, so that it's not just people running around aware of the technology, but the people who are running the show actually understand what's happening and give their approval. Review whether you really need to do it:
is this a fun toy that you can manage without, or is it something that will make a real difference?
Actually, actively think about it, rather than just playing.
Think about the privacy aspects; that goes without saying. Be transparent:
let people know that you're using this technology.
Don't fool people into thinking they're dealing with a human when they're chatting to some sort of robot, and be clear about when you're using generative AI and when you're not.
And it's the same issues.
Make sure you have accuracy, and that you track where it's getting its information.
And that's kind of being dealt with as the new models come online. But most importantly, ensure that you have a human in the loop before acting. That's the lesson from robotic warfare: make sure you have a human who makes the final decision before you act. Don't leave things up to your robots. And finally, the privacy: the information that you give to that AI is held by that AI, and you don't know what it does with it. So think about not only the use of the information, but also your information: where is that going? So Michael Webster gave those seven points, and I think they are the drivers of how we take things forward, and how we fit things into the existing legislation around responsibility, around privacy, around equity.
All of these old values that apply to the way that humans work apply to the way AI works. And it's not super different. It's not the end of the world.
It's not something we haven't thought about. It's a new tool.
We need to learn how to use and manage that tool, and go forward on that basis. Thank you.
A science policy overview
Dr George Slim, Consultant Advisor to the Prime Minister's Chief Science Advisor
Video transcript
[Lenka and the speaker discuss some technical issues with Lenka's camera and video set up]
Speaker: Take it away, Lenka.
Lenka: All right, I'll start. You won't be able to see my beautiful face.
And luckily, at the very last minute, I also decided not to use my slides, to avoid all of the tech in and out.
So now that I have no camera, I was already preemptively mitigating risks in an agile and adaptive way, which is part of the messaging of my talk today. Anyway, thank you very much for introducing me and for having me here.
It's a great pleasure to be a part of this symposium. As was introduced, I'm the assistant director of the Higher Education Integrity Unit at TEQSA.
For those of you who aren't familiar with TEQSA, which I'm sure are many, I'll give you a really super brief snapshot.
We're the higher education regulator in Australia, so very similar to NZQA. However, we do not do the vocational disciplines; we just do higher education. Our main legislative frameworks are the Higher Education Standards Framework and the ESOS Act, which is the Education Services for Overseas Students Act. And we monitor compliance within the sector in line with these two frameworks.
We also do this by identifying and assessing risks in the sector, and by sharing information, guidance and support to help the sector as well. Now, regarding artificial intelligence, I think part of our messaging on this is very similar to lots of other messaging that we do. TEQSA, as a government agency and regulator, does not stand in isolation.
This is very much a shared problem and a shared journey. Institutions, students and TEQSA all have their individual parts to play, but also within those individual parts there are definite intersections in the Venn diagram. Now, if I was sharing my slides, you would see a very beautiful Venn diagram here, but you can just imagine it.
Sometimes imagination is better than what I would've created anyway.
So, institutions: they provide the leadership and resources, the policies and procedures, the training and support for students and staff. They're also the ones who are responsible for deciding how to maintain assessment integrity. And ooh, it's saying I'm gonna start my video. Look at this.
You're gonna finally get to see my face. There I am. I have to have headphones because for some reason when I use Zoom I can't hear audio through my computer, but that's a separate problem altogether. Now, the students: their responsibilities are basically their behaviors, their attitudes, their own personal integrity, and how they apply that in relation to the expectations of their institution and also their student leaders.
And TEQSA's role, obviously, is to adhere to the government legislation, to provide regulation to the sector, to provide support, and in certain instances to have enforcement measures when needed. Now, our key messaging around artificial intelligence (and I'm sure I'm not going to say anything too controversial here) is that we don't believe you can ban AI or anything like that.
It's here to stay, basically. But there are lots and lots of legitimate opportunities for the application of artificial or generative AI, and whatever emerging techs come out in the future, in the higher education setting and in society more broadly. AI can be a great assistive tool.
It can assist students in their study, particularly students living with disabilities. It can assist academics and professionals by taking a lot of the grunt work out of administrative practices. It can even assist in research, in the sense of some of that really laborious wading through information.
It can speed all that up, so you can get to the higher-level analytic research faster.
And what we're trying to say is that institutions really need to think closely and critically about these opportunities, but also be really aware of the risks, and figure out how to balance these two competing things so that they can leverage the opportunities while making sure that the integrity of their qualifications remains. A huge part of this is to implement some kind of risk management analysis, and also to have the relevant governance and oversight.
And the other thing that I think is worth thinking about is, you know, what is the integrity and the purpose of a higher education degree?
What do we want students to gain from a higher education award in the future?
A lot of it is still talking about the now, but what will a higher education award look like in the future? And, you know, in certain disciplines, why would students even wanna study that if supposedly this emerging tech can do it all?
So these are some of the really big questions that I think we need to ask as a sector, but also individually and perhaps, to go a little bit too flamboyant, existentially. Some of the key focus areas that TEQSA (not just me) is highlighting are to do with assessment methods, learning outcomes, and the actual integrity of the award, which I briefly touched upon a second ago;
the skills that we want students to have; and also making sure there's consistency (that's the word my brain was after) between what institutions are asking and disciplines are teaching, and what the relevant government bodies are also messaging.
So, are the current assessment methods providing the necessary assurances to demonstrate the learning outcomes? Are the learning outcomes still the right ones? Each discipline is gonna be impacted differently. My background is in the humanities and in philosophy, and generative AI will impact that: you're gonna have to teach students some integrity around how much of a philosophy essay it is okay to write with generative AI. But other disciplines are gonna be radically and fundamentally changed.
First-year data science students are going to have to be taught in quite a different way to first-year data science students 10 years ago.
So these are some of the things that need to be thought about also to kind of keep in mind that students aren't experts.
So if we are teaching them to prompt critically and be critical of the outputs that generative AI gives them, what does that actually mean?
And how is that similar or different from the critical thinking that they're already supposed to be learning?
So these are some points that you have to think about. Once those things are analyzed and put in place, that's what's going to sustain the integrity of the higher education award,
and make sure that students are equipped with the necessary skills when they're leaving their study, so that the sector and their employers will have faith in what they've come out with.
Now, I'm very aware of the time, so I'm gonna move on. Some of the questions that institutions (and actually all of us, to be quite honest: not just the institutions, but the regulator, everyone) need to think about: what is the plan over the immediate, the medium and the long-term future?
I think a lot of the talk we are currently having is about the immediate. What are we gonna do?
Everyone was flapping their arms six months ago, and now they're like, no, no, it's totally fine, we'll figure this out. But how is your institution triaging work? How is your institution managing risk?
How is your institution documenting decisions, executing action plans, monitoring progress? Also, what are the artifacts that can be produced to demonstrate that a strategy is being executed? You also have to think about your academic governance processes.
Generative AI is just the first thing; you know, my imagination's poor, and who knows what's gonna happen in 10 years.
So, are your governance processes agile and adaptive enough to provide the necessary oversight and rigor to ensure consistent quality? Are the rules and expectations that you are putting forward, for your institution and by discipline, documented, and the reasoning justified, particularly differences within disciplines?
And are those differences, if there are any, clearly communicated to both students and staff, so that there are no unintended consequences and everyone's on the same page? Are you actually considering and mitigating the potential for rapid changes?
Yeah, at the moment we're talking about ai, we're trying to figure out how that works, but five years down the track, there might be a new technology tool.
Are the things that we're putting in place now going to be adaptive enough to accommodate for that?
Or are we just going to constantly be going through this same thing every five years, every one year, every two seconds, depending on how fast the technology changes?
So these are really things that everyone needs to think about. Now, to quickly give you a few key takeaways and then pass over to the next speaker so I don't eat into their time:
I think it's really important for all of us to genuinely recognize the opportunities and the risks. Have the student experience front of mind. Reflect on how decisions will impact across the breadth of offerings, and how they'll differ between offerings. Principles of good governance,
always, always, always. As the famous and much-touted saying in my agency goes, the fish rots from the head down (I was about to say up, but that doesn't make sense). And clear messaging.
And my last takeaway: in an era where generative AI is only the beginning, what's the transformative piece of work needed to guarantee the ongoing integrity of the education system? A lot of what is happening right now is saying, we've got our system; how do we let AI in and make sure everything's okay? But maybe what we need to do is go back to first principles.
What are the fundamental pillars of education that are non-negotiable?
What is absolutely necessary?
Then what are the affordances that we are willing to accept with the emergence of AI, generative tools and other emerging technologies?
And then from there we can discuss how artificial intelligence or whatever's coming out fits into there.
And often the analogy is made between generative AI tools and the calculator.
And I don't think that's a very appropriate analogy.
I was trying to think of another one this morning and didn't really get there, to be quite honest; but I was thinking it's probably more in line with the invention of the printing press. It's not just about higher education; it's the world as we know it. When the printing press was invented, it changed the landscape completely, and this is where we are now.
So I think maybe we need to go back to first principles, what's important, how do we manage those risks? And then talk about AI. Anyway, thank you very much and I will pass you over to, or I won't, probably the host will, but I will end and let the next speaker speak.
Thank you very much.
The Australian perspective
Dr Lenka Ucnik, Assistant Director Higher Education Integrity Unit, Tertiary Education Quality and Standards Agency (Australia)
Video transcript
Speaker: Our next speaker, Cathy Ellis, is a professor in the School of the Arts and Media.
She's the faculty's student integrity advisor.
While her background is in Australian and postcolonial literature, her current research is in the area of academic integrity, with a particular interest in contract cheating. In 2019, Times Higher Education named her as one of their people of the year for her work in this area.
She's a Principal Fellow of the Higher Education Academy, and in 2010 was awarded a National Teaching Fellowship of the Higher Education Academy.
Please welcome Professor Cathy Ellis. Kia ora.
Cathy Ellis: Thank you very much. Perhaps somebody could let me know that I can be heard; hopefully I'm coming through clearly enough. Oh, good. Thank you. All right. Thank you very much, and it's lovely to be joining you today.
I would like to begin by acknowledging the traditional custodians of the land that I'm on today,
and I pay my respects to their elders past and present.
And I extend that respect to any Indigenous or First Nations people who are joining either in person or online today.
And I just shut the blind, because I realized I was silhouetted.
But I can confirm that the sun is shining today, which is lovely.
I have got some slides to share, so I'm just gonna go ahead and share them now. I'm gonna start with something that I read on Reddit a couple of weeks ago, which was somebody saying: I delivered a presentation in a master's course program completely generated by ChatGPT,
and I got full marks. And a little later on in the Reddit post: for everyone in higher education, I genuinely wish you the best of luck.
And that's kind of an interesting place that we find ourselves in.
And I wanna just unpack what I think we need to think about from our side of the defense, from the institutional perspective, having heard from Lenka. A lot of what I'm gonna say is gonna resonate with what she said, and it's really looking at it from the institutional side,
because this is the sort of relationship that we have with our students.
We are in the learning business.
Our fundamental purpose is to facilitate student learning. That's what we do.
But the way that we manage that is that we tend to get students to externalize their learning through some kind of an artifact.
Often an essay or some kind of a report, um, perhaps an exam, something like that.
And what we then do is we treat that artifact as a proxy for the actual learning that's going on.
Because I need to remind everybody that learning is embodied.
It is something that you cannot outsource to somebody else.
It's a bit like your sleep, your exercise, your nutrition.
You cannot outsource learning. I cannot get somebody else to learn French for me;
it is physically impossible to do.
So what we are doing here is we're actually using the performance in these artifacts as a proxy for the actual learning. Why do we do that?
Because it's efficient. Now this is how we tend to do it.
We want to put our hands on our hearts that every student who crosses our graduation stage is at least in the good zone: they're at least just good enough, in that they have met the learning outcomes for their whole program of study to a level that is just good enough. In Australia, we tend to use this measuring system to ascertain that that's happened.
And what we're focusing in on is that yellow line, about warranting that students have met at least the just-good-enough standard.
The problem we're facing right now is that since November last year, ChatGPT has been able to produce artifacts of around about this quality: around about the just-good-enough to not-nearly-good-enough kind of scale.
So that was about the end of our last academic year. But since March of this year, ChatGPT has moved on to GPT-4, and it's now able to produce work of this kind of standard. That has happened between the end of last academic year and pretty much the start of this academic year.
So this leaves us in a bit of a situation. Lenka talked about existential crisis; I don't think it's an existential crisis. On one hand, don't panic; but on the other hand, it's become really clear that doing nothing is no longer an option. Which leaves us with a very important question to ask ourselves: what do we need to do, and who needs to do the doing?
And this is where I return to the work of Phil Dawson, who's a colleague of Professor Bearman, who we'll be hearing from in a moment.
And I always say, why bother putting a boring old citation on the bottom of my slide when I can have a photograph of Phil looking very happy, holding his book?
This book I really strongly recommend to participants here today.
Even though it was written before generative AI really became a big thing, it's still incredibly focused on the same principles that we need to bear in mind.
So it's absolutely a fantastic resource. He gives us many gifts, but he reminds us that cheating is both contextual and socially constructed. And I wanna give you an example of what that means in the real world: the same technology and the same behavior that in one context is absolutely acceptable, and even commendable, in another context is absolutely cheating. And here's an example: riding an e-bike.
Now, if I decided to leave my car at home and start riding an e-bike on my commute to work, most people would go, yeah, fair enough; or even, yeah, good one, one more car off the road.
But if I was a competitor in the Tour de France, everybody would agree that that is mechanical doping; that is cheating.
And so we need to remember that a lot of the doing needs to be done at the local level, at the level of the actual learning, in the context, in the discipline that needs to have that learning demonstrated.
As Lenka has already explained to us, this is going to vary from discipline to discipline.
This is incredibly out of date already.
I took this in February 2023, from something that was shared on Twitter.
It's a really great taxonomy that reminds us that what we're dealing with here is not just ChatGPT. In fact, what you'll notice on this is that ChatGPT is not even really referenced;
it's talked about as GPT-3. A lot of the things that we are seeing here expose different ways in which generative AI can move from, say, text to image, or from text to video, and so forth.
The ones on the other side of the slide, not on the left-hand side, are some of the ones that I think students were heavily using before the launch of ChatGPT.
And you can see some of them are specifically set up for writing essays.
QuillBot was bought by Course Hero, and you just need to take a second or two to figure out the business model going on there.
Most students are using DeepL for translation.
And of course, music students are using Shazam in listening tests.
So it's all a big landscape out there that students are exploring.
And these are the kinds of conversations that students are having among themselves. This is a young woman called Livvy Dunne.
She's a gymnast at Louisiana State University. She recently posted on her TikTok feed a sponsored post from Caktus AI, which is one of the big AI essay-writing tools, and obviously, as you can see, gave it a thumbs up. This was 10 seconds on TikTok.
Livvy has 7 million TikTok followers.
So influencers are out there talking to each other about the benefits of using ChatGPT to cheat. So where does this leave us?
Well, on one side of the coin, we still need to be confident that all students have done their work themselves.
That has not changed. That hasn't changed because of generative AI.
It hasn't changed because of ChatGPT.
It hasn't changed because of contract cheating. That's always been the case.
But these contextual shocks are coming our way.
But with the rise of ChatGPT and generative AI, we need to also ask: in a world where ChatGPT exists, what is the work?
And this is one of the key points that Lenka made in the presentation we've just heard. So, back to Phil. He returned from 10 weeks away on sabbatical and asked, what did I miss on Twitter?
We were still in that ChatGPT existential crisis.
It was about nine o'clock at night,
and I happened to be feeling that ChatGPT existential crisis.
And I replied by saying, I've gone from worrying about not having enough evidence to prove that cheating has occurred to worrying about not having enough evidence to prove learning has occurred. This tweet seemed to resonate with quite a few people.
And the way I'm explaining it to people at the moment, we still need to turn to face the problem of finding evidence to prove that cheating has occurred.
But we need to remind ourselves of the importance of facing in the other direction to find evidence to prove that learning has occurred.
One of the consequences, I think, is that we're going to see an increase in failure rates, and we have to think about why there is a reluctance to fail; there are lots of factors contributing to that.
But one thing we probably need to do is actually turn our energies in that direction.
If a student has cheated and they haven't demonstrated the learning outcomes, and we fail them but don't refer them for serious academic misconduct, then we're often achieving the same outcomes.
But we are not actually getting to the heart of the matter, though it's probably better than doing nothing.
So the analogy that I'm using at the moment, and this again chimes with some of the things that Lenka said just before, is that we need to rethink things.
This is a paradigm change that we're going through.
And paradigm changes are hard. They're disruptive, they can feel like existential crises, but let's remind ourselves what we're doing here.
We are in the very business of helping students climb Mount Everest.
Getting a degree from a university is a big deal, and we want them to scale to the very summit of achievement. But do we really need to see them trek to base camp every single time?
Now, for this analogy to work, we all have to pretend that we live in a world where altitude sickness doesn't exist, so just go with me on this one.
But do we really need to see them trek to base camp every time, now that there is a helicopter that can get them there? In some instances, in some contexts, yes, we will need to see them trek to base camp every single time.
In some other instances, we might only need them to show us they can do that once or twice, three times safely, confidently.
In some instances we may never need to see that they could do that.
And another thing we need to think about is, well, who's piloting the helicopter and what's the helicopter made of?
And do we understand that?
So I think these are some of the conversations we need to be having with ourselves in terms of the substantive medium- to long-term changes, as Lenka put it, that we need to be thinking through, and bringing in the idea of evaluative judgment.
I have a funny feeling Professor Bearman might talk about this; it is, I think, a really fundamentally important thing. I'll give you two quick examples of this.
In a recent Sony World Photography Award, the winner immediately admitted that they used AI to create the image.
The point that I think is really important here is that the judges noted his interest in the creative possibilities, but also emphasized that the image draws heavily on the photographic knowledge that he acquired before ChatGPT existed.
Another example, which is hot off the press as I just got this off Twitter this morning: an academic got students to look closely at ChatGPT-generated assessments, and they found that all 63 essays had what he calls hallucinated information or fabricated quotes.
And the students were shocked that it could mislead them.
He says probably 50% of them were unaware that it could do this.
But I think the really interesting things here are the concerns by the students about what they call mental atrophy and the possibility of fake information and fake news; but also that they're recognizing that AI is both cleverer than them and dumber than them.
And there's that worry about getting to a point, not worrying so much about AI getting to where we are, but about us getting down to where AI is, and how that is going to intellectually impoverish our world. Now remember, our core business is to help students move onwards towards graduation.
We need to keep focused on that. And this is where I'd like to share some thoughts on how we might think about that in the bigger context.
I'm using some theory from John Braithwaite called responsive regulation, which he's used in other contexts.
And I'm mashing it together with the two conceptual frameworks that Phil Dawson gives us in his book: academic integrity, which is the positive mission, and assessment security, which is the measures taken to harden assessment against attempts to cheat, and the approaches to detect and prove where cheating has occurred. The first is cooperative; the second is adversarial.
And Phil says that on its own, academic integrity is not enough; we need to bring assessment security in.
So this is work that I'm doing with Kane Murdoch, and it takes the idea of the enforcement pyramid. We want to put all students into this enforcement pyramid and map their attitudes on one side, according to their willingness and ability to do the work of learning themselves: from our champion and compliant students at the bottom, who are both willing and able; to our callous and confused, who are willing but not always able; to our chancers, who are able but unwilling; to the criminal element at the top, who are both unwilling and unable. Now, if we think about these different types of attitudes: at the bottom (sorry, that's a bit hard to read),
the public and institutional risk is very low, but at the top it's very high.
And what we need to do is map our institutional strategies to respond to those attitudes: supporting and advising at the bottom, monitoring in the middle, and directing and compelling at the top.
One of the things we need to think about here is that if we choose that top strategy and apply it to all those students at the bottom, it's gonna really annoy them. It's going to frustrate them and make them feel surveilled and untrusted.
But if we try to use the strategies at the bottom of the pyramid with the student attitudes that are at the top of the pyramid, it won't work.
It won't have any impact on them.
And if we think about this also from a a cost point of view, the strategies at the bottom of the pyramid are both resource intensive and emotionally costly.
So what we do then is infill the pyramid with tactics to implement our strategies, where the tactics at the bottom of the pyramid are also available for the students at the top of the pyramid.
But the students at the bottom of the pyramid don't need the tactics at the top. The main idea here is that we want to create downward pressure to encourage improvement, and to get as many students down into the bottom of our pyramid as we can.
And we're doing that against the upward pressure of contract cheating, of generative AI and all sorts of other opportunities to cheat.
And if we overlay Phil Dawson's two discourses over the top of this, we can see where we need to put our energies. Just as in Braithwaite and Ayres' project, the bulk of our institutional investment needs to be in that enforced self-regulation segment of the pyramid.
That's where we need to put our energies and our investments: in strategic workforce planning, in big data gathering, sector-wide intelligence sharing, and all sorts of other things that a lot of us are probably currently not doing. Okay, just to finish off. My main message today is that we need to empty the value of cheating from our courses. It doesn't matter how students cheat, whether it's using generative AI or contract cheating: the value of cheating on our courses at the moment is very, very compelling.
And we need to empty that out. We can't secure everything.
This is another big message we get from Phil Dawson.
And no task can ever be completely secure. And this is my other big message.
We need to start with stopping: we cannot ask academics to do any more right now.
And if something is futile, trying to do it harder doesn't make it any less futile.
The analogy I'm using is trying to grasp or pick up a jelly.
It is impossible; it's futile trying to do that. It's futile to try and secure online exams, but more and more people are trying harder and harder to do it, harder and harder to grab this jelly, and it's just turning into a big sticky mess.
So, thinking about where we put our energies: it's much easier to secure the learning and the knowledge on the right-hand side of this spectrum than it is on the left-hand side.
So maybe we should give up on trying to secure the stuff that's factual and conceptual, and focus instead on trying to secure the stuff that is procedural and metacognitive. Actually, I'll just jump across that one. I do think we're going through a paradigm shift, a paradigm change.
And I've gone back to Thomas Kuhn's work to revisit it. He doesn't actually use the term paradigm shift; he uses the term scientific revolution. But one of the things he says is that we know we're encountering a paradigm change when we're confronted by anomalies, or counter-instances, or questions we can't answer.
And I think that there are two critical questions that we can't answer at the moment. The first is: how do we stop students from cheating? And the second is: how can we be sure that our graduates have learned what they need to be safe and competent professionals? In effect, if we can't answer those questions, this is what we are saying: we cannot guarantee that all graduates have met all the required learning outcomes for their program of study.
And we cannot guarantee that our graduates have not cheated on some, most or all of their assessment. And I put it to you: what can our brand, our higher education brand, tolerate?
And I'll also just leave you with a quick plea: we need to introduce critical AI studies into our work.
We need to look at the neo-colonial exploitation that goes into building these tools as well as the social media tools that we use.
We need to think about the carbon cost, and we need to think about cybersecurity; Samsung learned that to their peril.
All of these tools have ingested bias and there's also serious concerns about IP and copyright from artists and from indigenous peoples in particular.
And these tools, by their very nature look backwards, not forwards.
That's not what our sector should be doing though.
I'll end it there and pass back.
The link between cheating and assessment
Professor Cath Ellis, University of New South Wales (Australia)
Video transcript
Download the presentation - Generative AI – the issues right here, right now [PDF, 4.3 MB]
Speaker: Professor Margaret Bearman.
Professor Margaret: Thank you so much. I'm hoping everyone can hear me. I've got a slide show, so I'm going to share with you now, and after all these years on Zoom I still often get this wrong, so let's hope that it works. Oh, and it's not going to; I'm not gonna start in the right spot. Right, that's better. And yeah, that looks right to me. So I'm just going to kick off and hope that that's okay. I'm talking about generative AI: the issues right here, right now. But I'm also gonna talk a bit about the future as well. And I wanna talk from an assessment design perspective. And thanks Lenka and [unclear], it was great.
[Acknowledgement of Country]
So in that context, you know, we're in this time of revolutions and paradigm shifts, and I think it's also really important to take pause and acknowledge that there are other traditions where education has happened and will continue to happen. So, the purpose of today: I just wanna shine a little bit of light on the implications of generative AI for university assessment. I put "university" in brackets, because I think you can take it as applying to school education as well; I've been talking a lot with schools and colleges. I'm gonna start with implications for the short term, and I wanna zoom out at the very end to implications for the longer term.
And I wanna start off by defining assessment as the graded and non-graded tasks undertaken by enrolled students as part of their formal study, where the learner's performance is judged by others, teachers or peers. And the reason I want to use this definition, and put it front and center, is because I want to not just focus on the moments of assessment where we grade students. There are other moments of assessment, and teachers are not the only ones doing the judging. In other words, I want to return us to the inherent tension in assessment. I work in assessment and assessment design, and one of the interesting things about it is that it has many functions, but there's this really big tension between needing to assure learning and needing to promote learning. That's probably familiar to everyone in this room, educators, but I want to say that in this time we often get really caught up in the assurance. And I want to pull us a little bit back to the promoting-learning function as well, because in any assessment design there's always a trade-off between these two things, and ChatGPT and other generative AI are exacerbating that tension.
So I'm drawing here from the work of colleagues Jason Lodge, Jaclyn Broadbent and Sarah Howard, who have written a great little LinkedIn post about how educational institutions are responding to generative AI. They've got six categories, and I think they've nailed it: ignoring it, banning it, invigilating, embracing, designing around, and rethinking. And I think we can agree, as Cath's slides alluded to, on the first two. Ignoring it is not really gonna work; we really have to say, look, this is out here. And banning it isn't really working: we know that students are using it irrespective of bans. I'm actually confused about where the banning is going on in secondary schools. It's been somewhat banned in different states in Australia, and I think it's been unbanned as well, on and off, with private and public schools doing it differently. So banning is confusing, and I don't think it's gonna work. So let's put those two to one side. That leaves us with three responses in the short term: invigilate, embrace, design around. I wanna talk about those, and then come at the end to rethink.
So should we embrace, or design around, or invigilate? That's really the question. Well, embracing: I'd say embracing is inevitable. It's gonna be in our enterprise software. You know, the launch of Bard means Google's transforming, and Microsoft's not far behind. So generative AI will be there in the day-to-day things that we use in our institutions. But it's still really uncertain, and I think this is gonna go on for a year or two at least. So, some of the uncertainties around ChatGPT and other generative AIs (I'm citing ChatGPT specifically because I know the most about it): there are legal uncertainties. Who owns the prompts? Who owns the source material? Is copyright being contravened? There are a lot of big question marks about that corpus and how it's being used, and there are going to be legislative arrangements that come into place on top of this. There are ethical uncertainties: issues of bias in the corpus, issues of truth, epistemic colonialism. There's all sorts of things going on here, ethically, that we're still feeling our way through. There are access issues without enterprise models, and there are cost concerns: can we assume that everyone can afford it? What if it falls over during assessment, in the way that we at the university tend to have a great deal of confidence in the platforms we ask our students to use? How can we, in the embrace situation, guarantee what we're doing? And most significantly, we don't quite yet know how anyone's using it. It's still really new. People are starting to experiment with it in professional workplaces, and students and educators are too, but it's still settling, particularly as the software develops and different versions come out. So to a certain extent, I'm going to say that large-scale embracing is very difficult to do right now. I'm not saying small-scale, but large-scale embracing, particularly in light of those legal and ethical uncertainties.
So we move to designing around, at a task or unit level, and I think it's the most sensible option right now for many assessment tasks. This means that at a small-scale level people may embrace, because it's what's happening in certain disciplines already; we know that software companies are really using generative AI to write code, and that's gonna be reflected in the disciplinary nature of assessment. More likely, what's happening is that everyone is concerned, at a unit level (there's the enterprise stuff, but I'm talking smaller scale here, the assessment design level), about inappropriate use of AI. So people are trying to shift their tasks to try and avoid students passing off AI work as their own. And yes, if you've got an essay about the trolley problem in philosophy that you've been using for 20 years, it's going to be a problem. This is a general proposition, lots of suggestions, little evidence yet: if the knowledge is common, then the task integrity is likely under threat. And most suggestions for changes to assessment adjust things around the commonality of the knowledge: the student has to represent knowledge that's not commonly available or even known. The advantage of this is that it doesn't interfere with the purpose of assessment to promote learning, while some approaches will, you know, focus on the assurance rather than the learning. So these are the sorts of possibilities: making knowledge requirements more specific; leaning into the relational. Do you know your students? If you know your students and they're producing odd work, that's something you can pin to a particular time and place, and that can alert you that something might be up. More in-class work, or synchronous work at any rate; specifically requiring the assessment task to reference something that happened in class; rewarding originality, something that no one has ever done before and that can't be found; making the task more authentic; designing something into a specific time and place; making sure the rubric rewards situational and relational success criteria. Now, these are in no way cheat-proof. I'm not suggesting that the influencer on TikTok couldn't get around these. But I'm gonna suggest that from an assessment design perspective, we can't be aiming at making things cheat-proof. Intentional cheating is very pervasive and very, very pernicious; cheaters are gonna cheat. What I think we want to do is make it difficult. In fact, possibly, even if people are going to cheat, they're gonna have to actually learn in the process as well. Not ideal, but those are the sorts of ways I think to frame it.
So then we come back to the next question: what about invigilation? Well, what's wrong with invigilation? Well, as Phil Dawson, who Cath mentioned, points out, the work of Bretag and colleagues shows that cheating still goes on, possibly at high rates. So invigilation does not stop cheating, to the best of our understanding; cheating in exams happens a lot. And there are many negative effects to invigilated, timed exams. They're costly, they're stressful, and they test capabilities unrelated to the task. I've got a child doing year 12 in Victoria right now, and gee, those tasks seem to be a lot about good handwriting and being able to do something in 45 minutes. I don't know if those things really relate to his understanding of English or history or other things. And they're problematic in terms of diversity and inclusion, and only a narrow band of capabilities can be tested: we're saying that you have to do only those things in this very short period. For example (not that we would ask for one), a novel cannot be written in 45 minutes under exam conditions. There's a whole lot of things we're automatically excluding. There's lots wrong with invigilation.
So rethinking invigilation may be key; now we're moving to the rethink. Here are some early thoughts. We've got something floating on our CRADLE website around some of these sorts of ideas, and this echoes a little bit of what Cath was saying. Prioritize what needs to be invigilated across a program. Do we need hurdle tasks, say in first year, to say these are the skills that you need in order to do the other things? Then lots of opportunities to demonstrate knowledge and engage in feedback without invigilation; it's up to the students. And then at the point of graduation, we come back to a little bit of invigilation where outcomes must be assured, and the invigilation might be a move towards orals again. Problematic, in that people freeze, but maybe they could be dialogic rather than surveilling. Maybe the whole point of these orals is not really to show how much you know, but just that you know a little bit about what you're talking about, like a PhD defence. I've been seeing a few of those in the European systems: you've passed, it's not really that you're gonna fail, but if you don't really sound like you know anything, like there's not a link between you, the person, and the work that you did, that's a problem. And then, move towards assessment of learning outcomes across tasks rather than just within them. At the moment we say, well, you can do X here, and we tick it off. But what if we need to say: in this essay, in this oral, in this moment, you've demonstrated X, Y and Z; you've demonstrated these higher-order capabilities here.
One of my most favorite topics, and something I think we really need to do, is rethinking your curriculum to account for AI. One of the things I think we need to think about is standards, and this is where the evaluative judgment that Cath was talking about comes in: what counts as good? We have this sort of idea about machines, that they always produce accurate responses; that's where the calculator analogy comes in. I think we need to think about generative AI more like that uncle you have who talks big, but may not actually know what they're talking about. We need to be able to unpack what people are saying, and to see that there are absolute gems in there, but also things that may not be right. And all of us, and our students, need to start to attune to what good looks like, because that is also where we deal with things like ethics and so forth.
What do we want to count? So, conclusions; sorry, that was a rapid gallop through all of this: ignore, ban, invigilate, embrace, design around, rethink. Some thoughts for you. Artificial intelligence has already made huge inroads into our society. It remains an evolving and uncertain presence, and I would point you to the fact that I think AI has been here for a long time as well; Google is powered by AI. We have a lot of shaping going on in our world that we could usefully attune to at this moment. How we choose to address its presence in our assessment designs requires thinking broadly, not narrowly. And my last plea: assessment is not just about testing; it's always an intervention into learning. So whatever we do in our assessment designs will affect how students learn. And I think I left some time for questions, but I don't know whether that's gonna fit with this agenda. So thank you very much.
Speaker: Thank you, Professor Margaret. Unfortunately, we do not have any time for questions, so we will just move on and carry on.
Generative AI - the issues right here, right now
Professor Margaret Bearman, Centre for Research in Assessment and Digital Learning, Deakin University (Australia)
Video transcript
Achieving with integrity in academia: The aspiration and its obstacles
Associate Professor Jason Stephens, University of Auckland
Video transcript
Transcript coming soon.
Download the presentation - How AI is impacting my school [PDF, 2.1 MB]
How AI is impacting my school
Claire Amos, Albany Senior High School & Kit Willett, Selwyn College
Video transcript
Transcript coming soon.
Perspectives on AI panel
Jason Stephens (Convenor), Claire Amos (Albany Senior High School, Kit Willett (Selwyn College), student representatives
Video transcript
Transcript coming soon.
AI forum - New Zealand panel
Gabriela Mazorra de Cos (Convenor, Executive Council Member, AI Forum New Zealand), Professor Michael Witbrock (University of Auckland), Dr Karaitiana Taiuru (Taiuru & Associates Ltd)
Video transcript
Transcript coming soon.
Provider responses panel
Jenny Poskitt (Convenor, Associate Professor at Massey University), Dr Mark Nichols (Te Pūkenga), Kevin Shedlock (Te Herenga Waka - Victoria University of Wellington), Sue Townshend (Le Cordon Bleu, ITENZ)
Video transcript
Transcript coming soon.
Reflections on the day and next steps
Jenny Poskitt (Associate Professor at Massey University), Ellen MacGregor-Reid (Deputy Secretary, Ministry of Education), Dr Grant Klinkum (Chief Executive, NZQA)