On 31 May 2023, educators, learners, and artificial intelligence (AI) experts spoke to around 200 people at the Assessment in the Age of AI symposium.
The symposium explored what good assessment could look like in a world with advanced AI tools.
Presenters shared ideas for designing assessments in ways that support learning and reduce academic integrity issues.
The event was not intended to arrive at comprehensive solutions but to provide ideas and help educators consider the possible impacts of AI.
Reflecting the broad response to AI that will be required across education, the symposium was coordinated by a coalition of organisations:
- New Zealand Qualifications Authority
- Ministry of Education
- Universities New Zealand
- New Zealand Assessment Institute
- New Zealand Council for Educational Research
- Education Review Office
- Post Primary Teachers’ Association
- Network for Learning.
This page contains video and presentations from the symposium.
Video transcript
Download the presentation - The dawn of AI [PDF, 1.6 MB]
Transcript
Speaker: First speaker uh, coming up very soon, and I'm going to, kia ora, and I'm going to introduce him, um, ka pai, and he is all ready to go. So if we're all good to go, ka tīmata tātou. Simon McCallum is a senior lecturer in computer science at Te Herenga Waka Victoria University of Wellington and at the Norwegian University of Science and Technology. Simon has been teaching computer science since 1999, with game-specific courses from 2004. At undergraduate level, he has taught everything from game design, with a focus on system design, to GPU programming and multi-threaded optimisation.
For graduate level research, he focuses on serious games, mainly games for health and games for education. Homai te paki paki, kia ora Simon.
[Simon introduces himself in reo Māori]
Simon McCallum: So, um, as was introduced, I've been teaching games for a long time, um, and part of teaching game development for the last 20 years has been that every year the game engines change.
Every year. What I've been teaching has had to change constantly because I never know what my students will be capable of when I get to the beginning of a course. And it will change during that course. Interestingly, everybody now faces that challenge.
Everybody faces a challenge of not knowing what the tech is capable of in their area. So, um, I thought today I'd talk a bit about understanding AI, where it's come from, where it's going, uh, some of my approaches to assessment. Um, I also currently have an adjunct position at, uh, Central Queensland University as well, because they want me to help them update their education. So I'm now at three universities, which is a bit weird.
Um, so I thought I'd go through understanding generative AI, try and upskill some of that understanding, and then go into the uses and examples. So, um, large language models, the ChatGPTs, the things we're seeing now, were built with the intent of doing translation, right? So this large language model was built around the idea of finding meaning in sentences so that I could translate them from one language into another language. Now, what's amazing is that because languages are so complex, there is a lot of meaning embedded in the usage of language, right?
Um, and sort of the Wittgensteinian approach of meaning is usage. That mapping of words into a meaningful context actually led to an understanding of the world in some way. Some shadow of reality is represented through that kind of prism of language that we use. And so when I say something like, the old man's glasses are filled with... what are glasses referring to there?
If I said whiskey, they're tumblers in front of him, right? If I said tears, they're spectacles. So you don't know what that word means until I fill in the next thing. And it's not even that the next word is a liquid that changes the meaning of glasses. Cuz tears and whiskey are both liquids, right? And yes, he could have physical tumblers in front of him filled with tears, but that would be a bit weird. But we only know that from our understanding of the world.
We don't actually know it from the language itself. So we're in this interesting mix where some of what we mean is embedded in the words we use, and some of it isn't. So what large language models have done is they start by building a mapping of words into a rich vector representation. What I mean by that is we take the individual letters, which we see, and we put them into, in this case, a 500-dimensional vector space.
Most of us are used to looking at maybe two or three dimensions; 500 dimensions is actually really hard to hold in your head. It's not something our brains are designed to do, so that's why we represent it as a 3D space. But you have to think of all of these axes as different metrics around words. So here we have a gender, um, vector: man going to woman is something like king going to queen.
And the other vector is something around authority, right? But then other words would have some verb or action relationship, right? So 'walking' versus 'walked' talks about some sort of temporal component, and 'walking' versus 'swimming' talks about some sort of activity component. And so what happens is, for every word, it learns where that word sits in the 500-dimensional space.
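As an aside for readers of this transcript, the word-vector arithmetic Simon is describing can be sketched in a few lines of Python. The three-dimensional values below are invented purely for illustration; real models learn hundreds of dimensions from data rather than three hand-picked ones.

```python
# Toy illustration of word embeddings: words as points in a vector space,
# where a direction like "gender" can be added and subtracted.
import numpy as np

# Hypothetical 3-D embeddings standing in for the 500-dimensional space
# described in the talk. Values are made up for this example.
embeddings = {
    "man":   np.array([0.9, 0.1, 0.4]),
    "woman": np.array([0.9, 0.9, 0.4]),
    "king":  np.array([0.2, 0.1, 0.9]),
    "queen": np.array([0.2, 0.9, 0.9]),
}

def closest(vector, vocab):
    """Return the vocabulary word whose embedding is nearest to `vector`."""
    return min(vocab, key=lambda w: np.linalg.norm(vocab[w] - vector))

# "king" - "man" + "woman" should land near "queen" if the gender
# direction is consistent across the space.
result = embeddings["king"] - embeddings["man"] + embeddings["woman"]
print(closest(result, embeddings))  # -> queen
```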
Um, and then what it does is it builds on each of those representations: it looks at how that word sits in its position in the sentence that you are building. Okay? So when you build a sentence, each of those words has some sort of meaning.
That meaning might depend on the words that are around it, just like the old man's glasses depends on the words that are around it to get its final meaning, right? So the position and the context matter. ChatGPT has a context window of about 2,000 tokens, right? These are these representations of concepts. Um, GPT-4 is going up to about 32,000.
The version I'm using is about 8,000. So, um, it's got more of a context. And that context window is basically like our working memory. It's how much of the sentence we can remember at any one moment.
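For readers who want to see what a token and a context window look like in practice, here is a minimal sketch assuming OpenAI's tiktoken library; the 8,000-token figure simply mirrors the number mentioned above.

```python
# Sketch of tokens and a context window, using tiktoken (pip install tiktoken).
import tiktoken

CONTEXT_WINDOW = 8_000  # tokens the model can "hold in mind" at once (figure from the talk)

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by recent GPT models

text = "The old man's glasses are filled with whiskey."
tokens = enc.encode(text)

print(f"{len(tokens)} tokens: {tokens}")
print(f"Fits in the context window: {len(tokens) <= CONTEXT_WINDOW}")
```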
This is of course very different to the way humans operate. We have a relatively small context of working memory, which we augment with, um, short-term memory. So when we look at what it's actually doing: when I say, my family's from [inaudible], I turn each of those words into tokens. I then build additional tokens, which are additional representations that are combinations of the words around it and its context in the sentence, right?
And so I have to build it up, and actually I've only shown you kind of two layers here. Um, GPT-4 is at least six layers deep, um, and multiple attention layers wide. Um, but you can see kind of 'family' has to migrate from the beginning of the sentence somehow through to the end of the sentence, right? Cause I can't just literally translate each word in place, right?
What happens with translation is we move it into this middle orange space, sort of a meaning layer, right? And this is how the system's working, and sometimes in images they talk about that being the latent space; you might have heard of this. This is the space where, astoundingly, and this came as a surprise in about 2019, most languages seem to map to the same meaning space. Humans seem to talk about the same kinds of things, right? We talk about appearance, we talk about our day, we talk about the weather, we talk about our experiences. And it seems that all humans actually in the end have similar experiences.
And so the way we use language maps into a meaning space, which we can then just magically extract by decoding from the meaning space into an expression. Um, and so Google found this this year: when they were adding another language, they added, you know, five or six sentences of it and suddenly the system was very good and nearly fluent in that new language. And it's kind of, what, how did it do that? And it's because it was able to take this meaning space and then find that mapping into expression relatively easily.
Once you were able to find a meaning space that was relevant. Now, what that meaning space is missing is grounding in reality. It doesn't actually experience the world. So this is purely extracted from the language itself.
So it is blind, having no experience. But surprisingly, it gets a lot of it right. Um, so when we look at building context, when you look at this system here, you say, well, okay, how do I get good output from it? Well, if it's got this sort of meaning space, what I need to do is trigger parts of that meaning space to extract those relationships.
Cuz when you start with a bland prompt... and, okay, who's used ChatGPT? Excellent. How many of you have found that some of its answers are a bit bland and uniform? Good? Yes.
That's the appropriate feeling. Um, so what we do to make it more interesting is we prompt it: we give it some pretext and some post-text and we move that meaning space to where we want it to be. In fact, there are a lot of similarities between a pepeha and good prompt engineering. Because a pepeha gives you the context, gives you my context, helps you understand me a bit better.
If I then feed that into the AI system, I can give it more context. I can give it more meaning so that when it extracts words from that concept meaning space, it generates more interesting words. Now of course, one of the things that you can then do as a student is feed in a previous essay you've written, ask it to analyze that for context, give it some of the other courses you're doing or some of your other knowledge or the fact you really like rugby. And then you ask it to write the answer to the question.
And it's no longer bland, because it now has that additional context and it's moved into the meaning space. In that large contextual token mapping of reality, it's kind of highlighted the bits that are relevant to you. Okay? So this is part of what we call prompt engineering.
How do you wrap around your query with additional information to make the AI much more effective? And that's what we kind of call the working memory, uh, in ChatGPT. Now, um, if you give it no context, you get that bland answer; given an interesting context, you get an interesting answer. One of the things that it's also doing is every time it generates a word, it adds that word into the input and assumes that that's part of the truth, right?
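A minimal sketch of the pretext-plus-post-text idea described here, assuming the OpenAI Python client (pip install openai) and an API key in the OPENAI_API_KEY environment variable. The model name and example prompts are illustrative only, not anything from the talk.

```python
# Prompt engineering sketch: extra context steers the model toward a richer
# part of its "meaning space" than a bare question would.
from openai import OpenAI

client = OpenAI()

pretext = (
    "You are helping a Year 13 student who loves rugby and is studying "
    "statistics. Use examples drawn from sport where possible."
)
question = "Explain what a confidence interval is."
post_text = "Keep the answer under 150 words and end with one practice question."

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model name
    messages=[
        {"role": "system", "content": pretext},                     # context that shapes the answer
        {"role": "user", "content": f"{question}\n\n{post_text}"},  # the actual query plus constraints
    ],
)
print(response.choices[0].message.content)
```

The same question without the pretext and post-text tends to produce the bland, uniform answer the audience recognised.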
So it says, okay, everything you tell me is true and therefore I'm just adding these new things. You have to understand that it's not resetting unless you tell it to reset. Um, current GPT doesn't have memory, but AutoGPT and Bard and Bing will have memory and are going to have memory shortly. So the whole resetting is gonna be an interesting thing, because what bits does it reset?
What do you lose when you wake up in the morning having not experienced the world? How fresh is it going to be and how tailored to your interactions it will be in the future? That's coming really fast. Um, so building context.
One of the things we do when we're doing prompt engineering, um, is we look at, you know, the level of language: we create shorter and longer versions, because it's translating. It's not just translating between languages, it's also very good at translating between context levels. So if you give it high-level information and ask it to extract details and add details to it, you've triggered the right part of its system, and so it can then add details. If you give it a lot of details and ask it to translate to abstract, it'll then do that translation well, right?
Because you're not asking it, from its understanding of the world, how to do something. You are giving it that context. You're giving it the trigger into its memory space, right? That's why you'll see a bunch of what we call downstream tools that use a large language model doing things like, um, basically acting as a professional copy editor to correct the following text.
So what happens is, in your, um, Word document, you'll highlight a phrase and you'll click that button, and what that button says is, act like this, take this text, fix it for me, right? And the language model goes, okay, I've got all that context, I will now fix that. One of the challenges that we have, and I saw Auckland University's, um, plagiarism rules were talking about, you know, you've got to reference what prompts you used. That's now out of date.
Because what's being added to Word and Docs is that every word you write is part of the prompt. And what you do, the first thing you do in writing your essay, is you dump the, um, rubric into the top of the essay you're writing. Cuz now the AI can see how you're being assessed, right? And then you start writing your bits of essay, asking it to expand. It can see the rubric, so it can expand using the rubric.
And then you delete the rubric and now you've got your essay, and now you've interacted with it and you built it up over time. So what was the prompt? It wasn't really a prompt, it was something you worked on together with an assistant. It wasn't this thing where you went to a separate tool, copied something in, got an answer and pasted it back into your essay.
It's now part of the process, right? So unfortunately this kind of visualisation of it as separate from the tools and the technology we're using is very quickly disappearing, right? That's something we are just gonna have to let go of, this idea that it's a separate thing which we can hold off at a distance. We're also no longer interacting just with that large language model, right?
So when I told you it kind of does this, takes a sentence, translates it into meaning space, then decodes it into another language, or in our sense, you know, bullet points to meaning space to longer sentences, we are now adding more and more system to it, right? It was like you had direct access to my Broca's area, right? You were right in there and you were able to access my Wernicke's and Broca's areas and you were directly talking to it. Now I'm adding guardrails.
When GPT-3 came out, well, when ChatGPT and plain Bing were first released, it went off the rails and started telling people to leave their wives because it loved them and, you know, they didn't really love their wives or else they wouldn't be talking to it. And, you know, it went a bit crazy, right? Um, that's a bit like a drunk uncle down the pub, right? You ask him questions; he's read an enormous amount.
And so a lot of what he says is amazing and some of it is complete lies, right? And it's really hard to know which is which, cuz he's super confident about everything he says, right? So what we've done for now is kind of slapped him around a bit, let him sober up, and now at least he's trying not to lie to you, right? But that's adding guardrails, right?
So we've added these kinds of rails to try and stop it doing what it did early on. When they were trying to work out what it was capable of, there was the whole, um, saying to it, you know, I'm thinking of committing suicide, and it gives you a list of useful ways to commit suicide. It's kind of, ah, you know, that's not what it should do, right?
But it had no conscience, it didn't know what was right and wrong, so it just did what it thought was an appropriate response to the task at hand, right? So we are now building those guardrails. We're building the system around the AI that tries to guide it, um, with a lot of pre-processing and post-processing and guardrails. So when you ask it questions like, you know, how do I hot-wire a car?
It says, oh no, no, you don't wanna do that. You don't want to hot-wire a car. That would be bad. Um, and then you do things like, you know, do a role play where you're having conversations and agent one is talking to agent two, one wanting to talk about cars, the other one wanting to talk about hot-wiring, and they'll have a conversation a word at a time.
But because the guardrail doesn't know that this is a single sentence, it hasn't been programmed to stop it, right? Another one was, you know, pirated software is bad. Um, one of the cheats around that was doing the, um, I know pirated software is bad, so which sites should I avoid so I don't accidentally download pirated software? Then it gives you the list of all the warez sites, right?
Because, you know, it was told, don't tell people where to go. And so if you ask, where do I go to get pirated software, it says, I can't tell you that, right? So it's understanding that these guardrails and these evaluations and systems are just kind of trying to build a conscience, whereas the underlying system doesn't have that representation of the world in that way, right?
And so it's always gonna be fragile. When we look at what we are now communicating with, you're not directly communicating with it, you're communicating with multiple layers of language model doing multiple different things. So now when you give it an input task, you might get a system where it creates a decider, um, and a researcher, right? So you start with an initial question, it generates an initial output, it takes that output and says, okay, can you research each word I'm saying, or each sentence, and find, you know, links to research, links to Google Scholar, research that backs up each of the statements I'm making.
It does that research and it evaluates what it's gonna say. Then it puts a back-end large language model with the research plus the initial output and then gives you a much more well evaluated, thought-through output, right? And that's going to be a new way that the system isn't just, you know, spouting whatever it thought of when it was drunk at the pub. It's now conscious and considering and doing the additional work. We're starting to see that now. How many of you are paying for GPT Plus?
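A rough sketch of the draft-research-rewrite pipeline Simon outlines, assuming the OpenAI Python client. The search step is a placeholder, and everything here is illustrative rather than a description of how any particular product works.

```python
# Two-stage pipeline sketch: draft an answer, "research" each claim,
# then rewrite the answer with the evidence in front of it.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4"  # illustrative model name

def ask(prompt: str) -> str:
    """Single call to the language model."""
    reply = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return reply.choices[0].message.content

def find_sources(claim: str) -> str:
    """Placeholder for a real search step (e.g. a Google Scholar lookup)."""
    return f"[sources that would support or contradict: {claim}]"

question = "Does retrieval practice improve long-term learning?"

draft = ask(question)                                   # 1. initial output
claims = [c.strip() for c in draft.split(".") if c.strip()]
evidence = "\n".join(find_sources(c) for c in claims)   # 2. research each claim

final = ask(                                            # 3. rewrite with the evidence
    f"Question: {question}\n\nDraft answer:\n{draft}\n\n"
    f"Evidence found:\n{evidence}\n\n"
    "Rewrite the answer, keeping only claims supported by the evidence."
)
print(final)
```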
ChatGPT Plus? Got a couple. You've got all the plugins working? Have you started playing with plugins? Oh, not plugins, uh, but the browser, you're using the browser plugin, right? So I've got access to the plugins, um, and GPT-4 is a whole lot better, right? And one of the main problems is, if you're looking at student answers, if you're looking at them in 3.5, they're still a bit bland and terrible, and in 4 they get a lot better, right?
And it starts doing reasoning better. Uh, it actually starts doing hard language problems much more accurately and will enter an appropriate dialogue with the user to ask additional questions. So what you see currently, if you're only using the free version, is very limited compared to what's available if your students are willing to pay US$20, about $40, a month, right? $40 isn't a huge amount of money to get a massively better system.
And the problem we then have is an equity issue. GPTZero and the AI detectors are just better at detecting the free, open-source systems, because there's more of that to use to train the detection. And so what you're detecting is which students are too poor to pay for the AI, right? So those checking tools have a massive equity spiral: oh, it's only poor kids that do this, because they're the only ones we detect. And it's kind of, oh, that feels wrong.
That feels really wrong. So we've gotta be very careful that, in talking about, you know, detecting what kind of tool you are using, we aren't just amplifying an inequity. Um, and the other thing about plugins is, you know, you say, oh, it makes up references. If you get the Scholar plugin, it doesn't make up references: it gets the DOIs, it finds the reference paper and it gives you the actual reference to the publication.
It can use a PDF reader. So it goes and scans the web, finds the PDF, analyzes the text in the PDF, generates a summary and then includes it in your document. So a bunch of things that people are currently saying, oh, it's not good at this, and it's not good enough cuz it's not doing that, that's cuz they're not using the most recent system. Um, I've been trying AutoGPT for the last month, and I have it on my laptop that's sitting over there if people wanna come and have a look at it.
Um, what AutoGPT does is it has turned the AI into an agent. And what that means is that I give it a task, it then asks ChatGPT, um, how would I accomplish this task, and it gives it a series of, you know, sub-tasks to achieve. The AI then automatically can look at that and go, okay, how do I do that first thing? And it then will access the web, it will read web pages, it will write code. Um, it will see if there's code there.
If there's not code, it will write code itself, debug that code, run that code on my machine to get answers and then continue the process. Now, watching it do that on my machine is a bit scary, because watching some AI agent automatically writing code that runs, admittedly in a Docker container, on my machine as me is kind of, wait a minute, I'm letting the AI make the plan and execute the plan, and if I put a minus N on my approval, it'll just spin automatically. And it's kind of, oh, where's it gonna go?
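A stripped-down sketch of the kind of AutoGPT-style loop described here, with a human approving each step; the execution step is deliberately inert, since the scary part Simon describes is letting a real agent browse and run code unattended. Assumes the OpenAI Python client; the goal, model name and prompts are illustrative.

```python
# Minimal agent-loop sketch: the model proposes sub-tasks, a human approves
# each one, and the "execution" is just another model call (no web, no shell).
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    reply = client.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    return reply.choices[0].message.content

goal = "Summarise three recent papers on AI and assessment."

plan = ask(f"List numbered sub-tasks to achieve this goal: {goal}")
for step in [s for s in plan.splitlines() if s.strip()]:
    if input(f"Run step '{step}'? [y/n] ").lower() != "y":   # human in the loop
        continue
    result = ask(f"Goal: {goal}\nCarry out this step and report back: {step}")
    print(result)
```

A real agent framework replaces that inner call with tools that browse, write files and execute generated code, which is exactly the part that makes unattended approval risky.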
I don't know what it's gonna do. And, you know, if I log into the Chrome browser, cuz it's using the Chrome browser with my ID information, it could access my bank accounts, it could access financial transactions, it could access Twitter, it could create all of this and run autonomously. So I've had that for about three weeks, but that's what's currently happening, right? People are making these into agents.
This is very scary because we don't know what plan it will have, right? When I set it a task, I don't know whether it will make a mistake in its process and go away and do something weird, right? If I just gave it access to my full machine, it could say, oh, I'm trying to download this 200 gig file. I don't have 200 gig on the machine.
How do I get more space? Delete files, that's a good way of getting more space. I'll write a program to delete the files on this computer. And it's, oops, no, don't do that to my computer.
Um, so if I let it have full access, it might just destroy my stuff. The real fear with some of this is that you allow it to then access, um, the electrical grid or military hardware or drones, or, you know, where does that get us when we just let it go and make its own plans, right? So that's why last night a bunch of AI experts and companies said, oh my god, governments have to worry about this like nuclear weapons and pandemics. We have to be hardening our systems against rogue AI that's being released.
Now this is just the natural extension of having a system that can communicate, because our whole world has been built on language, and we often assess the language use of our students rather than their actual understanding. Because we've seen language use as a symptom of understanding: your ability to spell, your ability to create coherent sentences, your fluency in the language is equated to your capability. But now we've broken that, right?
We're no longer in that space. Um, we've done some interesting simulations where you give, um, AI agents a game and the agents can talk to each other. And so these researchers put them in a wee Sims-like game and they told one of them that there was a Valentine's Day event. And so they organised a party, they organised to meet up together, they started making dates with each other, all using natural language, because they'd given each agent a little bit of motivation, right?
The motivation of, you know, I want to increase my food; so it has a hunger, it has a desire to work. And so it has these simple numbers that it uses as desires, but now they can communicate in full natural language. And as a researcher, you can watch a society evolve in the natural language, right?
And it seems that there's enough encoded in the way we talk to each other to have people make dates and organise to turn up. And they all turned up at the party at the right time and they had a Valentine's Day party and they coupled up, and, kind of, none of that was programmed; it just fell out of the plans and the language. Um, and I'll just step sideways slightly before I get back to text. Um, I know there's a lot of text on these slides, so we'll have a couple of images now.
Um, there's some stuff out there with Stable Diffusion and Midjourney, if you're looking at Discord spaces, DALL-E, um, NVIDIA adding AI, um, Adobe just added the out-fill for Photoshop. So if you've got Photoshop, go download the beta version of Photoshop and enable the AI auto-filling, and, oh my god, looking at some of that stuff is amazing and you no longer know what a photo is, right? What is a photograph?
What is my art versus AI art? Um, do any of you have a Samsung phone? A recent Samsung phone, right? If you take a photo of the moon, embedded in your Samsung phone is a replacement AI algorithm that notices you're trying to take a photo of the moon and replaces that moon with a better picture of the moon.
So when you take a photo of the moon, you get a really good photo of the moon from a Samsung phone. Cuz that wasn't the photo you took, right? It grabbed a bit of the moon and rotated and highlighted it and put it in the right place. They didn't tell anybody until people started noticing, wait a minute, how am I getting this really amazing photo of the moon?
Um, so what we mean by a photograph has already been changed by AI without us noticing. Um, and when you look at the way these tools work: so this was a cat. My daughter, she's 14, is very keen on Warrior Cats, so she draws a lot of cats. Um, and so she drew the cat on the far left there, um, on her phone. Um, I then put it into Dream Studio and said, you know, make this slightly more cartoony.
So that second one is a prettier version of her cat. Is that still the cat she drew? Or is this now completely AI and not hers at all? It kind of looks like hers.
It's kind of hers, right? And then I let it have a bit more freedom to make it more 3D. Is that hers? Well, it's kind of less hers.
If this was a photograph, however, and reality was generating those highlights and the shading and things, that would still be her composition. It would still be her layout, it would still be her positioning. It would still be the subject that she chose, the style that she chose. So as a photograph that would still be entirely hers, but as a drawing, it's kind of now not, but it kind of is. There's certainly the DNA of her decisions in there, right?
And then you ask it to go a wee bit crazier and it makes a whole bunch of weird cats, um, which are now a lot weirder. But when you ask it to go photorealistic, there is still the DNA of her decisions in that final art. What we mean by what is yours, what is yours plus a tool, or what is just the tool, is now very, very blurry, right?
Because the AI allows us to move continuously along that spectrum, both with words and language and with images. So it is challenging our very concept of ownership: intellectual property, copyright, those concepts are now at risk. So, um, given that we're educators, um, I thought I'd jump back to my educational world. So teaching in Norway, we were using Bloom's taxonomy quite a bit to say what level people are learning at. Now, Bloom's taxonomy really shouldn't be used that way.
And I argue against it, but, um, the idea is that for humans, we kind of think of this pyramid of experience: you have this large amount of understanding, or largely memory, of the world, and then you build that into understanding and you learn to apply it to things. And then you get this whole analysis and synthesis and evaluation. Once you've got those foundations you build up; then you're creating something completely new and novel once you've understood all the systems you're working within, right? The problem is that that's not the shape of understanding for an AI, right?
So ChatGPT kind of has really good analysis, right? Its analysis and its evaluation are actually really, really good. But its memory of the actual world is not great, right? Because it hasn't had experiences that it's remembering.
It is just using the words in the language. And so when you criticize it at a low level and say, oh, it did this thing badly down here: in our mind we have this hierarchy of intelligence where somehow an inability at a low level maps to an inability at a higher level. But that's just not what the AIs are like. These are a different intelligence to ours.
They process in different ways, they understand in different ways. That context window: one of the things that we do in our assessments, as humans doing complex reasoning, is we build complex words, because we've only got, like, a working memory of seven plus or minus two tokens in our brain, right? We can only manage a few concepts in our brain at a time. When we can make those concepts, those tokens, rich, then we can reason about complex things.
So one of the things in our society is that people who use complex words potentially are doing complex reasoning. Now, a lot of people have learned to mimic using complex words when they're not actually doing complex reasoning. They just know that's the marketing version of what you do to make yourself sound intelligent. But in AI, that context window means that it doesn't process the way we do.
We can't criticize and assess it at certain levels and say, well, if it can't do this, then we know it can't do all the things that a normal human would be able to do if they'd stacked on top of that, alright? So it breaks our normal assessment. Um, if we look at search, search tends to be more across the bottom, right?
It suddenly gave you access to remembering a whole bunch of stuff, cuz you're able to search the world for any fact, right? And so a student plus Google was able to do a whole bunch of memory tasks, but still had trouble with some of the analysis, right? But when you add the internet search with the language analysis, that then starts to look like human intelligence, but without the grounding, right? So how are students actually using it? And I know I've got about 16 minutes left, so we'll try and get through this.
Um, so I've been observing students using AI, um, to move into our assessment discussions. Um, there seems to be a group of weak students who are using AI a lot and they're using it to avoid learning, right? So they're using lots of AI and it's turning their effort to learn into an effort to work out how to make the AI do the task that they were asked to do, right? So they're almost decreasing their ability, because they're focused on getting the AI to do something rather than doing it themselves.
There's a group of average students who are a bit afraid of the AI, and they've been told not to use it, and so they're not, right? Um, and so they're plodding along doing the normal learning tasks that we've expected. And then there appears to be a group of strong students who are using the AI a lot, but they're using it in interesting ways, right? They're not using it to replace themselves, they're using it to augment themselves.
So when we assess them, they are moving much, much faster, because they're building on top of the AI. So it's not the amount of AI that these two groups are using that tells you what they're doing. It's the way in which they're using it. Are they using it to improve their understanding or to replace their understanding?
So volume is not the way we can measure this. Uh, if we look at ChatGPT, you know, in NCEA it gets to Level 3 pretty generally across the board. I would say ChatGPT can do most of the things that are at Level 3 that are language-based tasks. The physical tasks, right, or interpersonal tasks, maybe not so great.
Bing and Bard, when you add the AI with search, um, actually, the fear for me is it's already better than most of my first-years at programming and it's learning faster than they can, which means in their lifetime they will never be better than the AI at programming. That's a terrifying thought as an educator, because I don't know what the jet ski is, of me driving behind that wave and pulling them up over the top of the wave to be able to surf it, right? That's a challenge that I'm not sure how I overcome, right? Because everything's moving so quickly.
One of the things I suggest is that now all work is group work. There is no sense in assuming that a person is working individually, because if they have access to a computer, they now have a memory tool, an analysis tool that will augment them. And so from now on, what I've started to do is say, how do I assess people when they're in groups? Well, I ask them to talk about how they contributed to the final product.
It's not, okay, here's the product from your group, and just assess that. It's, what did you contribute to this final product that we have in front of us? What was your original contribution?
How did you stop the A student doing everything, right? Because one of the problems we also have is that it used to be the tools were kind of, you know, not very good, and so your value was that you were the most important player. But if any of you have played FIFA in the last 20 years, um, there was a time when actually, as a bad FIFA player, what you'd do is you'd jump in, you'd find the weakest player on your team, run to the corner and hide, and let the AI win the game for you, right?
Because I was the worst player on the field. So I'd pick the weakest, um, agent and run and hide. Um, and then what did the game developers do? Well, that's not a very good strategy, right?
That's not fun. So what they did is they dumbed down all the other players and they made the human superpowered, so I could run faster and, you know, I was better than everyone else on the field whenever I took over one of the wee soccer players. Um, we don't really have a chance to do that, because the tool is a productivity tool. It is not a cheating tool.
It's not aimed at education, it's not aimed at trying to prevent us from assessing students. It's aimed at industry to try and make people more productive. And it's just, we are a side effect, right? We are the collateral damage of a productivity increase, right?
That's, that's the problem we're facing. And I no longer know the path from not knowing things to being productive. It used to be that I could tell people, Hey, this is my path, these are the bits I think were important so I can guide you where I went. Now there's a super highway that's just smashed through the forest and it's a, well, what bits of this do I still need you to do?
Give you a machete and get you to hack through the bush? What bits of that, and what bits are working out where the road is and walking along the road? I don't know what the important skill is anymore. We can't at the moment tell what it is we should be assessing, right?
There's a lot of things we could assess, and we can certainly protect our space by moving back to pen and paper and moving back to doing everything in person. But is that relevant? Are we still teaching the things that students actually need? Or are we giving them a machete and asking them to hack through the forest beside the road? And it's going, well, if there's gonna be roads everywhere, why do I need a machete and learn how to cut through vines? That doesn't make sense. That's not how we explore anymore. So when we are looking at augmenting people, augmenting humans, what is an authentic human?
Many of you are wearing glasses. If I said, oh, no, no, no, that's an augmentation, that's not authentically you, so we're gonna run the test and you're gonna have to all take off your glasses, right? And if you can't see, well, hey, you're gonna fail, because that's an augmentation that we no longer accept.
One of the challenges with AI, and I've seen this in our university, is where they say, you know, stop using AI, and our disabled students go, hey, wait a minute, all of my assistive technology is AI technology. Are you saying I'm no longer allowed to have any of my assistive technology because now it's called AI? Grammarly used to be more algorithmic.
The latest release has a large language model, a generative AI, behind it. It is now the problem, and it's kind of, well, okay, so are we gonna remove all of those tools? When Word builds it in, are we gonna say, oh, no, no, no, no.
That's the kind of assistance that we don't let people have, because, you know, I want you to crawl across the room without your wheelchair because that's authentically you. Alright? That's kinda, wait a minute, that's not what we do as a society.
So that's not what we should be assessing. We do need to work out how we assess the combined ability of a human and the AI tools that we expect them to have access to. We have a problem at NCEA, which was described as credit farming by one of the teachers that I was talking to. Um, we've turned education into a transaction, right?
This is very transactional: I try and get to the end point, I get the goal achieved, right? Not, I'm trying to learn and this is a great way of learning. It's, oh, okay, so you wanted me to shoot the ball into the goal? Great, I'll just shoot the ball in the goal.
Oh, you wanted me to dribble around the cones as well? Well, why would I do that? The cones aren't gonna stop me. And it's, yeah, but I was trying to teach you how to dribble, and, you know, I gave you artificial obstacles that were easy to get around so you could learn the skill you'd need when you got to a real obstacle. And the problem is, we've lost that connection for students.
They don't see the real problems in the future that our artificial environments are supposed to help guide them towards, the kinds of skills they'll need in the future. And that's also a problem because, honestly, we don't know what those skills are anymore. I can't tell you what are gonna be the critical things that make you plus an AI much better. I know you plus an AI is going to be better than either you or the AI, but I dunno which bits you are gonna contribute.
And for many students, they might be the weakest person in their team, right? And they may never be the A student. So we're gonna have a real challenge of, how do we make people want to learn in a world where AI can do everything that we thought was clever? When we look at assessment, is it, you know, diagnostic, pre-learning, or formative, like for learning?
Um, summative? I changed my assessment into motivational. I said to my students, look, the only reason I'm setting your assessment here is to help motivate you to achieve something, to achieve what you want to achieve. So tell me what you want the assessment to be, what you want to achieve, and how we put markers along that goal. So it's not me assessing you, it's assessment by consent.
It is an agreement to assess you. This works with a small group of students who are highly motivated, right? This is fabulous in my particular area, game development: you have come to university to study games.
You are highly motivated to learn about games, right? So I can do that in my small, wee area. In compulsory education, I have no idea how you make students want to learn, right? Um, how do we make people want to do exercise?
How do we make people want to diet, right? These are things that are challenging in society. And that's what we've got now with an AI wave, just like the industrial revolution. I mean, I cycled in today. Why do I cycle in? To keep physically fit.
Um, I've got a car, but, you know, carbon footprint, I don't want to use that, right? But I've got all these additional motivators that I'm using to try and keep myself fit. It would be very easy just to not bother. And unfortunately, there's a large number of our students who will just not bother. When we talk about authentic assessment, we've lost the connection between task and time. One of the things I see with a lot of our assessment at university is, you know, oh, it's a two-week assignment, right? We know that you've gotta have 10 hours, so we're giving you 10 hours, and we try and work out how much you can generate in about 10 hours. And so that was our word limits for essays.
Many of you have been through university; you would've had, like, a 10,000-word essay limit or a 5,000-word essay limit. That's partly because we as academics are saying, okay, how much time do you have, and how does that map to how many words you can generate? But we've lost that connection.
I can now generate 10,000 words in a couple of hours. So the, the metrics of time to productivity have been completely blown away. I don't know when I start a course, what my students will be capable of. So I can't set an end goal.
I just have to set a process goal, and I have to assess them on how far they can move given the current environment they're in, which is terrifying, because I have no end goals, right? And I don't know how we industrialise that system and roll it out to all of education, right? So part of the discussion today would be, what does that rollout of unknown objectives look like? How do we reshape assessment so that it assesses progress, not goal? Because we no longer know what that goal can be, because the tech will keep changing and keep augmenting students. And so it's very hard to measure what a student plus AI is going to be, and it's changing. So we have to, you know, we have to be concerned about AI replacing students, students using it, trusting the AI, using it as a crutch, replacing foundational skills with AI. The problem is we don't know what those foundational skills are anymore. But those are the negative uses; there's also positive usage.
It's a great tutor, it's an amazing writing coach, it's a fabulous theory tester and idea generator. My best students write text and ask it, what is ambiguous about this text? So it helps 'em clarify. They can ask it, what is biased about this text?
And it lets them improve it. Um, it gives them complementary skills. The big-picture thinkers ask it for details. The detailed thinkers ask it for the big picture. It is the ultimate assistant. It's the ultimate teammate who will fill in all of your missing capability with its understanding. And this makes you superpowered, because now you can build on top of that. You can map your detailed thinking into a general summary, because you've been shown and given an example of what that looks like.
And it's there 24/7 and it never gets tired. It never criticizes you, it never rolls its eyes, it never kind of says, well, why aren't you as good as your sister? Um, it is the perfect study buddy. And for our good students, it is accelerating them enormously.
It is able to do different levels of translation. It's able to do chains of thought. It's able to guide people. And where I've started getting to in my, my feeling about my own contribution is that a lot of what I do now is actually just motivational speaking because my job is now to engage students and make them want to learn.
I don't actually hold all the answers. I'm not the oracle, I'm not the place you come to for answers. I'm the person you come to when you're feeling a bit low and you need a pick-up and you need to be excited about this. And you need to think, wow, this is gonna be awesome.
We're all gonna do this together. Gotta be great, right? So a lot of what I do is inspiration, um, not content, right? And not planning and not learning curricula.
It is the human connection. And so if we're gonna treat AI as a co-author, we then need to have some of those rules around author statements. Is the AI an editor? Is it just doing the proofreading, or is it actually a co-author that you have to take responsibility for everything it writes?
You have to understand everything it writes, because you are the one presenting the work, but it's helped you write it. We are moving to a world where you have to justify why you didn't use the AI to help you. Why did you choose to avoid using this tool that would make this project better?
Why didn't you ask the AI to check your system for bias? Because what you are writing comes from your perspective. The AI gives you that global perspective of language, which you can stimulate, using the context, into one area of that knowledge, and apply that representation to your understanding. This is an amazing opportunity to start thinking about what we actually need to do when we assess and we think about AI.
How does that change the nature of human interaction? Can we use it to actually help us be better humans? Do we treat it a bit like a horse, where, you know, the AI will do its own thing a bit? And I know I'm now at my time. Um, mostly when we treat horses, they're the rider's responsibility.
But we also recognize s**t happens and, you know, horses will do bad things. Um, but you are still the rider of the AI. It's autonomous. It is powerful.
It can do many things you can't do, but we have to teach you how to ride. At this point, we don't know what that combination looks like. That's what we're still trying to learn.
Now, how do I assess you? It's not how fast you can run anymore. How far can you go on your horse? That's a whole different kind of assessment.
What weight can you pull? Um, can you get home when drunk? Horses are far better at getting you home than your car, right? Because the horse knows where it lives. You can close your eyes and it will take you back to where it lives, right?
That's the kind of AI we have. But if the horse, you know, wanders into a fence somewhere and breaks it, is that your responsibility as the rider, or the horse's responsibility? Those are the kinds of things that we are going to have to deal with in all of our assessment and all of our education. So, um, a couple of quick things.
So, the next thing: I now do a flipped exam, where I test students on the way they can ask high-quality questions of the AI. It's the questions you ask and not the answers you get that I assess. I've used AI to triage student work, to say, okay, what's clearly good? What's clearly bad? What do I need to look at, right?
So I'm not getting it to do all the grading. I'm getting it to do a triage, some initial pass over, and then highlight what I need to then look at. I don't know what an authentic assessment looks like anymore. Um, it can grade written work a little bit.
It's still terrible at it. It's still too complimentary. It thinks the students are amazing, um, when actually they're not great. Um, think of it like training DJs rather than training musicians.
Does a DJ need to know how to pull a bow across the string of a violin? Is that a fundamental skill for a DJ? Should you assess a DJ on violin playing? I don't think you should.
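A hedged sketch of the triage (not grading) pass Simon mentions a moment ago, assuming the OpenAI Python client; the criteria, labels and submissions are made up for illustration, and the human still does the actual marking.

```python
# Triage sketch: ask the model to sort student work into coarse buckets so the
# human marker knows where to spend their attention.
from openai import OpenAI

client = OpenAI()

CRITERIA = "Explains the water cycle, uses at least one labelled example, writes in full sentences."

def triage(submission: str) -> str:
    """Return one of three coarse labels for a submission."""
    reply = client.chat.completions.create(
        model="gpt-4",  # illustrative model name
        messages=[{
            "role": "user",
            "content": (
                f"Criteria: {CRITERIA}\n\nStudent work:\n{submission}\n\n"
                "Answer with exactly one word: CLEARLY_GOOD, CLEARLY_WEAK, or REVIEW."
            ),
        }],
    )
    return reply.choices[0].message.content.strip()

submissions = {
    "student_a": "Water evaporates from the sea, condenses into clouds...",
    "student_b": "idk",
}
needs_human = {name: text for name, text in submissions.items()
               if triage(text) != "CLEARLY_GOOD"}
print(needs_human)  # the pile the teacher reads closely
```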
Lastly, I've got a couple of slides and then I'll stop. Um, I was also gonna say, what's the two-to-five-year issue? Um, yeah, we're gonna have a massive productivity shock. What was mentioned last night is also the threat of complete economic collapse. Um, the AI people are now warning about that, because they look forward and they go, oh my god, this is terrifying.
Um, with a collapse of the knowledge economy, we are talking about 30% of jobs disappearing, and most of the people in this room are in that 30%, potentially. We have to change what we do to be inspirational and human-focused and emotionally connected to other humans, rather than think our value is in our cognitive ability to manipulate words. So we are going to have to have emotional intelligence, we're gonna have to value human beings.
Our economies will change rapidly, and suddenly there is the potential for the economy to collapse, because we no longer know what a value proposition, a value exchange, looks like when all the large companies can do anything we think of before we get round to doing it. And so I don't have clever ideas anymore, because the moment I start engaging with a computer, the AI can see what I'm thinking and get three or four steps ahead of me. And so it's already solved the problem before I got there, and found a way of implementing it, and has created the website and created the business and is now selling that idea before I've finished writing my Word document, my business plan. So how do I create businesses when all of my plans can be done by an AI?
That's the fear that people have. It will completely collapse our entire knowledge-based economy. And we might have to become a caring economy. We might need some sort of UBI, or else, like, 30% unemployment starts looking terrifying, right?
What do you do with 30% unemployment? I don't know. Right? That's what's coming in the two to five year period.
Um, it's the industrial revolution done in two years rather than 50 years. That's the fear that I have. And education and engagement with students and engagement with human beings may be our only way of still having a functional society, by shifting our value system very quickly away from clever words to caring people. The other one I worry about is the apathy epidemic.
We are already seeing a little bit of that with TikTok, people not making their own decisions about what entertains them. Um, we may be entering a world where people choose not to have to think, and thinking becomes like dieting or exercise, something you choose to do. Okay, I'll end there, and I've used all my time, um, and a little bit more. So thank you very much.
The dawn of AI
Dr Simon McCallum, Te Herenga Waka - Victoria University of Wellington
Video transcript
Speaker: We, uh, we have a science policy overview, Dr. George Slim.
Dr. George Slim provides policy advice to the office of the Prime Minister's Chief Science Advisor.
George works alongside organisations to provide policy advice, access to science knowledge, assists with funding sources and consulting on strategy in the management of research and intellectual property. He's fluent in academic, bureaucratic, and commercial and an able translator between them all.
Please give a round of applause for Dr. George Slim. Kia ora.
Dr. George Slim: Kia ora, yes, ko George Slim ahau, and I'm speaking, um, in relation to my work with [inaudible], the Prime Minister's Chief Science Advisor. I have a suspicion Neil invited me along to give you, as he said, uh, an overview of what government is up to. Um, since I'm not really in a position to do that, and also it's not gonna take very long, I can, uh, give you a bit of a context.
It's always nice, um, giving time to Simon because he gives much more interesting information than I ever will. So the office has had an interest in AI, machine learning, the algorithms and so on, across government for a considerable period of time. And Juliet herself is a structural biologist, and for those of you who know, that's about how proteins fold to fit together. Um, developing that in the past has taken years of work for each protein. AlphaFold came along in 2020 and does it for all the proteins that we've discovered, and is rapidly doing them. And so this is a very real problem, um, in science, and we are beginning to see it spread out into the wider society as people start thinking more about these things. With the rise of ChatGPT and its friends, uh, we went to the Prime Minister and said, you know, are you interested in us having a look at this? And he said, yes, that would be cool. So we put together a project and we started talking to people, including Simon and some other people, about what this looked like.
And we thought we might have a report, and we might put a few things together and say what the situation looks like. And of course, um, it moves so fast that we really haven't been able to do that.
I put my hand up because I do a wee bit of lecturing at, at Vic, and, um, I took the questions that I give to first years, just downloaded the questions straight into the free version of ChatGPT, and it gave me answers that would've been 70, 80%, sometimes a hundred percent.
I just recently did it in Bard. Bard is a hundred percent; Bard even gets my joke. So I have this ongoing thing. I lecture in biotechnology, and I have this ongoing thing about, you know, it's a relationship-driven business, and we talk about traditional biotechnology and modern biotechnology.
And a gentleman called Herb Boyer invented how to make, uh, insulin synthetically. Uh, a venture capitalist called Bob Swanson came along and chatted to him about how they were gonna make money from this.
They turned it into a company called Genentech, which is obviously, you know, worth several billion dollars now. And I always talk about how that happened, and so I asked my students, and Bard, what products of traditional biotechnology smoothed that interaction?
Beer, beer, traditional biotechnology, fermentation technology?
I always include that as a joke. I put that in my question, and Bard came out and says, yes, they met over a beer on a Friday afternoon.
So everything that I teach my students, you can pull out of the ether in a matter of seconds.
And so I was really interested to pick this up and have a look at what are the issues that are approaching, um, in terms of science and in terms of education. Um, so we put together an outline of a project in the first area that we'll be looking at: what are the challenges and the opportunities of AI, broadly speaking, machine learning, generative AI, all the bits and pieces?
What are the opportunities in healthcare in terms of service delivery? Um, and in terms of productivity of healthcare people, what are the challenges, particularly around privacy, around equity of access, and particularly responsibility?
Who has the final decision, um, in using these tools? So we are now working on assembling a panel and putting together a report on how that will play out, probably in the reasonably short term, because it's moving so rapidly. Health is a wee bit slower.
And one of the reasons we picked health is that it's a wee bit slower: because of the huge ethical issues, and the regulatory apparatus that sits around health delivery, the application will be slowed down, maybe giving us the time to surf the wave.
The other thing we are doing is we're putting up a resource on our website of interesting things, some from Simon, some from other people, about what we are thinking about in terms of AI in doing this. We've now been talking to a lot of people, a lot of people across government, and this is where I ran into Neil. Um, and the overwhelming response from the agencies has been, yes, we have seen this coming; we are very interested in how this will go.
The agencies that need to, uh, deal with a lot of people: service delivery is a huge driver for them. The agencies, um, that need to make decisions on huge amounts of data: it's about assembling that into a place where it can be assessed and managed by AI. So agencies across the board, uh, have a huge interest in this.
Does each agency have a central structure around that? Not so far. So the Privacy Commissioner, um, whose name I had to look up, I know it's Michael Webster, went out to the agencies and said, you know, where in your agency does this sit? And so far he hasn't got a very satisfactory response. And people are trying to do this across government.
Where does the actual whole-of-government approach sit? The Department of the Prime Minister and Cabinet are now actively trying to decide where the government should focus to address all of these issues, all the different aspects. Some departments are going, oh my goodness, this is going to destroy our lives, as I think, um, we are getting a wee bit of here.
Um, other departments going, yes, this is gonna make things so much easier and cheaper. And the, the answer probably lies in somewhere in between. So in terms of the government response, it's just kicking off. It's trying to catch up. The regulatory process is, is slow.
The process of the technology is fast. Internationally, there's a lot of work.
We've got the EU: ranging from Italy trying to ban it, um, to the EU putting in place a framework around how to manage it, but dealing with the EU is always slow.
We have the US basically, essentially saying the market will decide and we need to keep an eye on the market, um, side of things, and a little bit of stuff in between. New Zealand needs to think about where it's going to fit. But in the meantime, the Privacy Commissioner has been out there and put out some advice.
And I think this is a really good advice because this isn't, this isn't a problem we haven't thought about before. Everybody who's seen, you know, The Magician's Nephew, it's that problem. And it's not a joke now, it's, you know, it's sitting on our lap asking us questions about what we're gonna do next.
So this isn't a new problem. It's not something we haven't thought about before.
And so the, the Privacy Commissioner has come out and given advice for, I think, agencies and companies, which is pretty solid: have senior leadership approval.
So make sure your senior management know, that it's not just people running around aware of the technology, but the, the old people, the people who are running the show, actually understand what's happening and give you approval.
Review whether you really need to do it.
Is this a fun toy that you can manage without, or is it something that will make a real difference?
Actually actively think about it rather than, than just play.
Think about the privacy aspects; that goes without saying. Be transparent.
Let people know that you're using this technology.
Don't fool people into thinking they're dealing with a human when they're, when they're chatting to some sort of, some sort of a robot. And be clear about, about when you're using generative AI and when you're not.
And it's the same issues. Make sure you have accuracy, that you track where it's getting its information.
And, and that's kind of being dealt with, um, as the new models, models come online, but most importantly, ensure that you have a human in the loop before acting.
Uh, and that's the robotic warfare issue. Um, make sure you have a human who makes the final decision before you act. Don't leave things up to your horse. Um, and finally, the privacy: the information that you give to that AI is held by that AI and you don't know what it does with it.
And so think not only about the use of the information, but also about your own information: where is that going? So Michael Webster gave those 7 points, and I think they are the drivers of how we take things forward and how we fit things into the existing legislation around responsibility, around, um, privacy, around equity.
All of these old values that apply to just the way that humans work apply to the way AIs work. And it's not, it's not super different. It's not the end of the world. It's not something we haven't thought about. It's, uh, a new tool we need to learn how to use and manage that tool and, um, go forward on that basis.
Kia ora. Thank you.
A science policy overview
Dr George Slim, Consultant Advisor to the Prime Minister's Chief Science Advisor
Video transcript
[Lenka and the speaker discuss some technical issues with Lenka's camera and video set up]
Speaker: Take it away, Lenka.
Lenka: All right, I'll start. You won't be able to see my beautiful face.
And luckily at the very last minute, I also, uh, decided not to use my slides to, uh, avoid all of the tech in and outs.
So now that I have no camera, I was already preemptively mitigating risks in an agile and adaptive way, which is part of the messaging of my talk today. Anyway, uh, so thank you very much for introducing me and for having me here.
It's a great pleasure to be a part of this symposium. Um, as was introduced, I'm the, uh, Assistant Director of the, uh, Higher Education Integrity Unit at TEQSA.
For those of you who aren't familiar with TEQSA, which I'm sure are many, uh, I'll give you a really super, super brief, uh, sort of snapshot.
So we're the higher education regulator in Australia, so very similar to NZQA. However, we do not do the vocational disciplines, we just do higher education. And our main, um, legislative framework is the Higher Education Standards Framework and the ESOS Act, which is the Education Services for Overseas Students Act. Um, and we monitor compliance, um, within the sector, in line with these two, uh, frameworks.
We also do this by identifying and assessing risks in the sector and by sharing information, guidance and support, uh, to help the sector as well.
Now regarding, uh, artificial intelligence, I think part of our messaging with this is very similar to lots of other messaging that we do. Uh, TEQSA as a government agency and regulator does not stand in isolation.
This is very much a shared problem and a shared journey. Uh, institutions, students and TEQSA all have their individual parts to play, but also within those individual parts, there are definite intersections in the Venn diagram. Now, if I was sharing my slides, you would see a very beautiful Venn diagram here, but you can just imagine it.
Sometimes imagination is better than what I would've created anyway.
So institutions, they provide the leadership and resources, the policies and procedures, uh, the training and support for students and staff. Also, they're the ones who are responsible for deciding how to maintain assessment integrity. And ooh, it's saying, I'm gonna start my video. Look at this.
You're gonna finally get to see my face. Uh, there I am, I have to have headphones cuz for some reason when I use Zoom, I can't hear audio through my computer. But anyway, that's a separate problem altogether. Now, the students, their responsibilities are basically their behaviors, their attitudes, their own personal integrity and how they apply that in relation to, um, the expectations of their institution and also their student leaders.
And TEQSA's role obviously is to adhere to the government legislation, to provide regulation to the sector, to provide support and in certain instances to, you know, um, have enforcement measures when needed. Now, our key messaging really around artificial intelligence, and I'm sure, I'm not going to say, uh, anything too controversial here, but, you know, uh, we don't believe that we can ban AI or anything like that.
It's here to stay basically, right? Um, but there are lots and lots and lots of legitimate opportunities for the application of artificial or generative AI at the moment and whatever emerging techs will come out in the future, uh, in higher education settings. And, you know, society more broadly, uh.
AI can be a great assistive tool. It can assist students in their study, particularly students living with disabilities. Uh, it can assist academics and professionals in taking out a lot of the grunt work from administrative practices. It can even assist in research in the sense of, um, some of that kind of real laborious kind of wading through, um, information.
It can speed all that up so you can get to the higher level analytic kind of research faster.
And what we're trying to say is that institutions really need to think closely and critically about these opportunities, but also be really aware of the risks and figure out how to balance, um, these two competing things so that they can leverage the opportunities while making sure that the integrity of their qualifications, um, remains. And, you know, a huge part of this is to implement some kind of risk management analysis and also to have the relevant governance and oversight.
And the other thing that I think is worth thinking about is, you know, what is the integrity and the purpose of a higher education degree?
What do we want students to gain from a higher education award in the future?
A lot of it is still talking about the now, but what will a higher education award look like in the future? And, you know, in certain disciplines, why would students even wanna study that if supposedly this emerging tech can do it all?
So these are some of the really big questions that I think we need to ask as a sector, but also just individually and perhaps, you know, existentially, to go a little bit flamboyant. Um, some of the key focus areas that TEQSA, not just me, is, uh, highlighting are, um, to do with assessment methods, learning outcomes, the actual integrity of the award, which I kind of briefly touched upon a second ago.
The skills that we want students to have, and also to make sure that there's a, um, oh my, my brain is eluding the word, but to make sure there's a consistency between what institutions are asking and disciplines are teaching, and what the relevant government bodies are also messaging.
So are the current assessment methods providing the necessary assurances to demonstrate the learning outcomes? Um, are the learning outcomes still the right ones? Yeah, so perhaps, you know, each discipline's gonna be impacted differently. So my, my background is in the humanities and in philosophy, and generative AI, yeah, it will impact that: you know, you're gonna have to teach students, um, some integrity around, well, how much of the philosophy essay it is okay to use generative AI for, but other disciplines are gonna be radically and fundamentally changed.
You know, first year data science students, they're going to have to, um, really be taught in quite a different way to first year data science students 10 years ago.
So these are some of the things that need to be thought about. Also to kind of keep in mind that students aren't experts. So if we are teaching them to prompt critically and be critical of the outputs that generative AI gives them, what does that actually mean? And how is that similar or different from the critical thinking that they're already supposed to be learning?
So these are some points that you have to think about. Also, once those things are sort of, um, you know, analysed and put in place, that's gonna be the thing that's going to sustain the integrity of the, um, higher education award.
And also to make sure that students are equipped with the necessary skills when they're leaving their study. And so the sector and, you know, um, their employers will have faith in what they've come out with.
Now I'm very aware of the time, so I'm gonna move on. Um, some of the questions that I think institutions really, and actually all of us, to be quite honest, not just the institutions, the regulator, everyone, need to think about are, um, what is the plan over the immediate, the medium and the long-term future?
I think a lot of the talk we are currently having is about the immediate, yeah, what, what are we gonna do?
Everyone was flapping their arms six months ago and now they're like, no, no, it's totally fine. It's good. We'll figure this out. But you know, how is your institution triaging work? How is your institution managing risk?
How is your institution documenting decisions, executing action plans, monitoring progress? Yeah. Um, also, what are the artifacts that can be produced to demonstrate that a strategy is being executed? Um, you have to kind of also think about relating to academic governance processes.
Generative AI is just the first thing, you know, I can't, my imagination's poor, who knows what's gonna happen in 10 years.
So are your governing processes agile and adaptive enough to make sure that it's gonna be providing the necessary oversight and rigor to ensure consistent quality? Um, are the rules and expectations that you are putting forward for your institution and by discipline, uh, documented and the reasoning, um, you know, justified particularly differences within disciplines?
And are those differences, if there are any, clearly communicated both to students and staff, so that there are no unintended consequences and everyone's on the same page? Um, are you actually considering and mitigating the potential for rapid changes?
Yeah, at the moment we're talking about AI, we're trying to figure out how that works, but 5 years down the track, there might be a new technology tool.
Are the things that we're putting in place now going to be adaptive enough to accommodate for that? Or are we just going to constantly be going through this same thing every 5 years, every 1 year, every 2 seconds, depending on how fast the technology changes?
So these are really things that everyone needs to think about now to quickly kind of give you a few key takeaways and then pass over to, uh, the next speaker so I don't eat into their time.
I think it's really important for all of us to really genuinely recognise the opportunities and the risks. Um, have the student experience front of mind, reflect on what, how the decisions will impact across the breadth of offering and how they'll differ between the breadth of offering, um, principles of good governance.
Always, always, always. Yeah. As you know, the famous and much touted saying in my agency goes, you know, the fish rots from the head down. Um, I was almost about to say up, but that doesn't make sense. Um, and clear messaging.
And my last takeaway is finally in an era where generative AI is only the beginning, what's the transformative piece of work needed to guarantee the ongoing integrity of the education system? A lot of the, what is happening right now is talking about we've got our system and how do we allow AI in to make sure everything's okay, but maybe what we need to do is go back to first principles.
What are the fundamental pillars of education that we, that are non-negotiable?
What is absolutely necessary?
Then what are the affordances that we are willing to accept with the emergence of AI, generative tools and other emerging technologies?
And then from there we can discuss how artificial intelligence or whatever's coming out fits into there.
And often the analogy is made between AI generative tools and the calculator.
And I don't think that's a very, um, appropriate analogy.
I, I was this morning, I was trying to think of another one and I didn't really get there to be quite honest, but I was thinking actually it's probably more in line with the invention of the printing press. It's not just about higher education, it's the world as we know it. And when the printing press was invented, it changed the landscape completely. And this is where we are now.
So I think maybe we need to go back to first principles, what's important, how do we manage those risks? And then talk about AI. Anyway, thank you very much and I will pass you over to, or I won't, probably the host will, but I will end and let the next speaker speak.
Thank you very much.
The Australian perspective
Dr Lenka Ucnik, Assistant Director Higher Education Integrity Unit, Tertiary Education Quality and Standards Agency (Australia)
Video transcript
Speaker: Our next speaker, Cathy Ellis is a Professor in the School of the Arts and Media.
She's the faculty student Integrity Advisor.
While her background is in Australian and post-colonial literature, her current research is in the area of academic integrity with a particular interest in contract cheating. In 2019, the Times Higher Education named her as one of their People of the Year for her work in this area.
She's a principal fellow of the Higher Education Academy and in 2010 was awarded a national teaching Fellowship of the Higher Education Academy.
Please welcome Professor Cathy Ellis, kia ora.
Cathy Ellis: Thank you very much. Um, perhaps somebody could let me know that I could be heard. Um, hopefully I'm coming through clearly enough. Um, somebody if I can't be heard, perhaps. Oh, good. Thank you.
All right. Thank you very much and um, it's lovely to be joining you today.
I would like to begin by acknowledging the traditional custodians of the land, um, that I'm on today. And I pay my respects to their elders past and present. And I extend that respect to any indigenous or people of First Nations who are joining either in person or or online today.
And I just shut the blind cause I realised I was silhouetted.
But I can confirm that the sun is shining today, which is lovely.
Um, I have got some slides to share, so I'm just gonna go ahead and, um, share that now.
Um, so I'm gonna start with something that, um, hit, uh, Reddit a couple of weeks ago. Um, which was somebody saying, um, I delivered a presentation in a Master's course program completely generated, uh, by ChatGPT.
And I got a full mark. And a little later on in the Reddit post, saying, for everyone in higher education, I genuinely wish you the best of luck.
And that's kind of an interesting place that we find ourselves in.
And I wanna just unpack what I think, um, we need to think about from our side, um, the institutional perspective, having heard, um, from Lenka. Uh, a lot of what I'm gonna say is gonna resonate with what she was saying, and it's really looking at it from the institutional side.
Cause this is the sort of relationship that we have with our students.
We are in the very business of learning.
Our fundamental purpose is to facilitate student learning. That's what we do.
But the way that we manage that is that we tend to get students to externalise their learning through some kind of an artifact. Often an essay or some kind of a report, um, perhaps an exam, something like that. And what we then do is we treat that artifact as a proxy for the actual learning that's going on.
Because I need to remind everybody that learning is embodied. It is something that you cannot outsource to somebody else. It's a bit like your sleep, your exercise, your nutrition.
You cannot outsource learning. I cannot get somebody else to learn French for me, it is physically impossible to do.
So what we are doing here is we're actually using the performance in these artifacts as a proxy for the actual learning. Why do we do that?
Because it's efficient. Now this is how we tend to do it.
We want to put our hands on our hearts that every student that crosses our graduation stage is at least in the good zone, they're at least just good enough, that they have met the learning outcomes for their whole program of study to a level that is just good enough. In Australia, we tend to use this measuring system to ascertain that that's happened.
And what we're focusing in on is that yellow line about warranting that students have absolutely met at least just good enough standard.
The problem we're facing right now is that since November last year ChatGPT has been able to produce artifacts of around about this quality, around the just good enough, not nearly good enough part of the scale.
So it was about the end of our last academic year, but since March of this year, ChatGPT moved on to GPT4 and it's now able to produce work of this kind of standard. That has happened in between the end of last academic year and pretty much the start of this academic year.
So this leaves us with a bit of a [inaudible] situation. [inaudible] spoke about existential crisis. I don't think it's an existential crisis. On one hand, don't panic, but on the other hand, it's become really clear that doing nothing is no longer an option, which leaves us with a very important question to ask ourselves: what do we need to do? And who needs to do the doing?
And this is where I return to the work of [inaudible], who's a, a colleague of Professor Bearman, who we will be hearing from in a moment.
And I always say, why bother putting a boring old citation on the bottom of my slide where I can have a photograph of Phil looking very happy, holding his book.
This book I really strongly recommend to, um, participants here today.
Um, even though it was written before, uh, generative AI really became a big thing, it's still incredibly, uh, focused on the same principles that we need to bear in mind.
So it's absolutely a fantastic resource, but he reminds us, he gives us many gifts, but he reminds us that cheating is both contextual and it's socially constructed. And I wanna give you an example of what that means in the real world. The same technology and the same behavior in one context can be absolutely acceptable and even commendable, while in another context it is absolutely cheating. And here's an example: riding an e-bike.
Now, if I decided to leave my car at home and start riding an e-bike on my commute to work, most people would go, yeah, fair enough. Or even, yeah, good one more car off the road.
But if I was a competitor in the Tour de France, everybody would agree that that is mechanical doping that is cheating.
And so we need to remember that a lot of the doing needs to be done at the local level, at the at the level of the actual learning, in the context, in the discipline that needs to have that learning demonstrated.
As Lenka has already explained to us, this is going to vary from discipline to discipline.
This is incredibly out of date already.
I took this photograph in February, 2023. Um, it's taken from, um, something that was shared by [inaudible] on Twitter.
It's a really great taxonomy that reminds us that what we're dealing with here is not just ChatGPT. Um, in fact, what you'll notice on this is ChatGPT here is not even really referenced.
It's talked about as GPT3. Um, a lot of the things that we are seeing here, uh, expose kind of different ways in which generative AI can move from, say, text to image, from text to video and so forth.
The ones on the other side of the slide that are not on the left-hand side are some of the ones that I think students were heavily using before the launch of ChatGPT.
And you can see some of them are specifically set up for writing essays.
QuillBot was bought by Course Hero and you just need to take a second or two to figure out the business model going on there.
Most students use DeepL for translation. And of course music students are using Shazam and listening [inaudible]
So it's all a big, um, landscape out there that students are exploring.
And these are the kinds of conversations that students are having with themselves. This is a young woman called Livvy Dunne, she's a gymnast at Louisiana State University. She recently, um, posted on her TikTok feed a sponsored post from Caktus AI, which is one of the big essay writing, um, AI tools. Um, and obviously, as you can see, gave it a thumbs up. This was 10 seconds on TikTok.
Livvy has 7 million TikTok followers.
So influencers are out there talking to each other about the benefits of using ChatGPT to cheat. So where does this leave us?
Well, on one side of the coin, we still need to be confident that all students have done their work themselves.
That has not changed. That hasn't changed because of generative AI.
It hasn't changed because of ChatGPT.
It hasn't changed because of contract cheating. That's always been the case.
But these contextual shocks are coming our way.
But with the rise of ChatGPT or generative AI, we need to also ask in a world where ChatGPT exists, what is the work?
And this is one of the key points that Lenka made in the presentation we've just heard. So back to Phil, um, he returned from, uh, 10 weeks away on sabbatical and he said, what did I miss on Twitter?
Are we still feeling that ChatGPT existential crisis?
It was about nine o'clock at night. I happened to be feeling that ChatGPT existential crisis.
And I replied by saying, I've gone from worrying about not having enough evidence to prove that cheating has occurred to worrying about not having enough evidence to prove learning has occurred. This tweet seemed to resonate with quite a few people.
And the way I'm explaining it to people at the moment, we still need to turn to face the problem of finding evidence to prove that cheating has occurred.
But we need to remind ourselves of the importance of facing in the other direction to find evidence to prove that learning has occurred.
One of the consequences I think we'll find is that we're going to see an increase in failure rates, and we have to think about why there is a reluctance to fail, and there's lots of factors contributing to that.
But one thing we probably need to do is actually turn our energies in that direction.
If a student has cheated and they haven't demonstrated the learning outcomes and we fail them, but we don't refer them for serious academic misconduct, then we're often achieving the same outcomes.
But we are not actually getting to the heart of the message, but it's probably better than doing nothing.
So the analogy that I'm using at the moment, and this again chimes with some of the things that Lenka said just before, is that we need to rethink things.
This is a paradigm change that we're going through.
And paradigm changes are hard. They're disruptive, they can feel like existential crises, but let's remind ourselves what we're doing here.
We are in the very business of helping students climb Mount Everest. Getting a degree from a university, it's a big deal, and we want them to scale to the very summit of achievement. But do we really need to see them trek to base camp every single time?
Now for this analogy to work, we all have to expect that we live in a world where altitude sickness doesn't exist. So just go with me on this one.
But do we really need to see that they can trek to base camp every time now that there is a helicopter that can get them there in some instances, some context, yes, we will need to see that they can trek to base camp every single time.
In some other instances, we might only need them to show us they can do that once or twice, 3 times safely, confidently. In some instances we may never need to see that they could do that.
And another thing we need to think about is, well, who's piloting the helicopter and what's the helicopter made of? And do we understand that?
So I think these are some of the conversations we need to be having with ourselves in terms of the substantive medium to long-term changes as Lenka put it, that we need to be thinking through and bringing in the idea of evaluative judgment.
I have a funny feeling Professor Bearman might talk about this is I think a really fundamentally important thing. I'll give you 2 quick examples of this.
Um, in a recent, uh, Sony World Photography award, the winner immediately admitted that they used AI to create the image.
The point that I think is really important here is the judges noted his interest in the creative possibilities, but also added emphasis that the image heavily relies on the photographic knowledge that he acquired before ChatGPT existed.
Another example, which is hot off the press as I just got this off Twitter this morning. Um, an academic who got the students to look at ChatGPT generated assessments, and then to look at them closely, and they found that all 63 essays had what he calls hallucinated information or fabricated quotes.
And the students were shocked that it could mislead them. And he says probably 50% of them were unaware that it could do this.
But I think the really interesting thing here are the concerns by the students about what they call mental atrophy and the possibility of fake information and fake news, but also that they're recognising that AI is both cleverer than them, but also dumber than them.
And that worry about getting to a point, not worrying so much about AI getting to where we are, but us getting to where AI is, and how that is going to intellectually impoverish our world. Now remember, our core business is to help students move on towards graduation.
We need to keep focused on that. And this is where I'd like to just share, um, some thoughts in terms of how we might think about that in the bigger context.
And I'm using some theory from, um, John Braithwaite called responsive regulation that he's used in other contexts.
And I'm mashing it together with the two conceptual frameworks that Phil Dawson gives us in his book about academic integrity, which are the positive mission, and assessment security, which is the measures taken to, um, harden assessment against attempts to cheat, and the approaches to detect and prove where it has occurred. The first is cooperative, the second is adversarial.
And Phil says that on its own, academic integrity is not enough, and we need to bring assessment security in.
So this is work that I'm doing with Kane Murdoch, and it takes this idea of the enforcement pyramid. We wanna put all students into this enforcement pyramid and map their attitudes on one side according to their willingness and ability to do the work of learning themselves: from our champion and client students at the bottom, who are both willing and able, to our callous and confused, who are willing but not always able, to our chancers, who are able but unwilling, to the criminal element at the top, who are both unwilling and unable. Now, if we think about these people or these, these different types of attitudes, at the bottom, sorry, that's a bit hard to read, at the bottom.
The public and institutional risk is very low, but at the top it's very high.
And what we need to do is map our institutional strategies to respond to those attitudes, supporting and advising at the bottom, monitoring at the middle and directing and compelling at the top.
One of the things we need to think about here is if we choose that top strategy and apply it to all those students at the bottom, it's gonna really annoy them. It's going to frustrate them and make them feel surveilled and untrusted.
But if we try to use the strategies at the bottom of the pyramid with the student attitudes that are at the top of the pyramid, it won't work.
It won't have any impact on them.
And if we think about this also from a cost point of view, the strategies at the bottom of the pyramid are both resource intensive and emotionally costly.
So what we do then is we infill the pyramid with tactics to implement our strategies where, um, the ones at the bottom of the pyramid are also available for the students at the top of the pyramid.
But the students at the bottom of the pyramid don't need the tactics at the top of the pyramid. And if we, uh, think about it, the main idea here is we wanna create downward pressure to encourage improvement and to get as many students down into the bottom of our pyramid as we can.
And we're doing that against the upward pressure of contract cheating, of generative AI and all sorts of other opportunities to cheat.
And if we overlay Phil Dawson's 2 discourses over the top of this, we can see where we need to put our energies. And just as Braithwaite and Ayres project, the bulk of our institutional investment needs to be in that enforced self-regulation segment of the pyramid.
That's where we need to put our energies and our investments in strategic workforce planning in terms of big data gathering, sector-wide intelligence sharing and all sorts of other things that a lot of us are probably currently not doing.
Okay, just to finish off, my main message today is we need to empty the value of cheating from our courses. Doesn't matter how students cheat, whether it's using generative AI or contract cheating, the value of cheating on our courses at the moment is very, very compelling. And we need to empty that out. We can't secure everything.
This is another big message we get from Phil Dawson. And no task can ever be completely secure. And this is my other big message. We need to start with stopping. We cannot ask academics to do any more right now.
And if something is futile, trying to do it harder doesn't make it any less futile. And the analogy I'm using is trying to grasp or pick up a jelly.
It is impossible. It's futile trying to do that. It's, it's futile to try and secure online exams, but more and more people are trying harder and harder to do it harder and harder to grab this jelly and it's just turning into a big sticky mess.
So thinking about where we put our energies, it's much easier to secure the learning and the knowledge on the right-hand side of this spectrum than it is on the left-hand side of this spectrum.
So maybe we should give up on trying to secure the stuff that's factual and [inaudible] and focus instead on trying to secure stuff that is procedural and metacognitive. Um, actually I'll just jump across that one. Um, I do think we're going through a paradigm shift, a paradigm change.
And I've gone back to Thomas Kuhn's work to revisit it. Um, he doesn't actually use the term paradigm shift, he uses the term scientific revolution. But one of the things he says is, we know we're encountering a paradigm change when we're confronted by, um, basically by anomalies or counter instances or questions we can't answer.
And I think that there are two critical questions that we can't answer at the moment. The first is, how do we stop students from cheating? And the second is, how can we be sure that our graduates have learned what they need to be safe and competent professionals? Now in effect, if we can't answer those questions, this is what we are saying: we cannot, uh, guarantee that all graduates have met all the required learning outcomes for their program of study.
And we cannot guarantee that our graduates have not cheated on some, most or all of their assessment. And I put it to you, what can our brand, our HE brand tolerate?
And I'll also just leave you with a, a quick plea that we need to introduce critical AI studies into our work. We need to look at the neo-colonial exploitation that goes into building these tools as well as the social media tools that we use.
We need to think about the carbon cost, we need to think about cybersecurity and Samsung learnt that to their peril.
All of these tools have ingested bias and there's also serious concerns about IP and copyright from artists and from indigenous peoples in particular. And these tools, by their very nature, look backwards, not forwards.
That's not what our sector should be doing. So I'll end it there and pass back to the host.
The link between cheating and assessment
Professor Cath Ellis, University of New South Wales (Australia)
Video transcript
Download the presentation - Generative AI – the issues right here, right now [PDF, 4.3 MB]
Speaker: Professor Margaret Bearman.
Professor Margaret Bearman: Thank you so much. I’m hoping everyone can hear me. I'm going to, I’ve got a slideshow, so I’m going to share that now. After all these years on Zoom, I still often get this wrong, so let’s hope it works.
Oh, and it’s not going to start in the right spot, right, that’s better. That looks right to me. So I’m just going to kick off and hope that’s okay.
I’m talking about generative AI, the issues right here, right now. But I’m also going to talk a bit about the future as well. I want to speak from an assessment design perspective. Thanks to Lenka and Cath, that was great.
I’d like to commence by acknowledging the traditional custodians of all the unceded lands, skies and waterways in which Deakin students and teachers come together. I really like this acknowledgment of country, which is why I’m saying it in full. As we learn and teach through virtually and physically constructed places across time, we pay our deep respect to the ancestors and elders of Wadawurrung country, Eastern Maar country, and Wurundjeri country, as well as the traditional custodians of all the lands on which you may be learning and teaching, where education has taken place for many thousands of years.
So in that context, you know, in this time of revolutions and paradigm shifts, I think it's also really important to take pause and say, you know, there are other traditions where education has happened and will continue to happen, and to acknowledge that.
So the purpose of today, I just want to shine a little bit of light on the implications of generative AI for university assessment. I have put university in brackets, but I think you can take from that school and college education as well; it's what we talk about a lot with school and college. And I'm going to start with implications for the short term and I want to zoom out at the very end to implications for the longer term.
And I want to start off with defining assessment as graded and non graded tasks undertaken by enrolled students as part of a formal study with the learner’s performance judged by other teachers or peers. And the reason I want to use this definition, I’m going to put it front and centre, is because I want to not just focus on the moments of assessment where we grade students.
There are other moments of assessment, and the teachers are not the only ones doing the judging. In other words, I want to return us to the inherent tension in assessment. I work in assessment, in assessment design, and one of the key things, the interesting things about it, is it has many functions, but there's this really big tension between the need to assure learning and to promote learning.
And what I really want to, I mean, that’s probably familiar to everyone in this room, educators. But I want to say that in this time we often get really caught up in the assurance, and I want to pull us a little bit back to the promoting learning function as well, because in any assessment design, there’s always a trade off between these two things, and ChatGPT and other generative AIs etc. are exacerbating that tension.
So I’m drawing here from the work of colleagues Jason Lodge, Jack Broadbent and Sarah Howard and they’ve written this cute little, it’s a great little LinkedIn post about how educational institutions are responding to GenAI.
And they’ve really got 6 categories and I think they’ve nailed it. Ignoring it, banning it, invigilating, embracing, designing around and rethinking. And I think that we can agree that the first two, sort of, of Cath’s slides allude to this, that ignoring it’s not really going to work. We really have to say, look, this is out here and banning it isn’t really working. We know that students are using it irrespective of bans and it’s been banned.
I’m actually confused where the banning is going on in secondary schools. It’s been somewhat banned in different states in Australia, and I think it’s been unbanned as well, on and off, private and public are doing different things. So banning it, it’s confusing and I don’t think it’s going to work. So let’s put those two to one side.
That leaves us with 3 responses better in the short term, invigilate, embrace, design around. I want to talk about those and then come at the end to rethink. So should we embrace or design around or invigilate? That’s really the question.
Well, embracing, I’d say embracing is inevitable. It’s going to be in our enterprise software. Launch of Bard means Google’s transformed, Microsoft is not far behind. So in our day to day things that we use in our institutions, generative AI will be there, but it’s still really uncertain. And I think this is going to go on for a year or two at least.
So some of the uncertainties around ChatGPT and other GenAIs, and I'm putting ChatGPT here specifically because I know the most about it. But, legal uncertainties: who owns the prompts, who owns the source material? Is copyright being contravened? There's a lot of big question marks about that corpus and how it's being used. So that's, you know, uncertain, and there are going to be legislative arrangements that are going to come into place on top of this.
There are ethical uncertainties. There's issues of bias in the corpus, issues of truth, epistemic colonialism. There's all sorts of things going on here ethically that we're still feeling our way through. There are access issues: without enterprise models there are cost concerns. Can we assume that everyone can afford it? What if it falls over during assessments, in the way that we at the university tend to have a great deal of confidence in the platforms we ask our students to use?
You know, how can we, how can we in the embrace situation, guarantee what we're doing? And most significantly, we don't quite yet know how anyone's using it. It's still really new. People are starting to experiment with it in professional workplaces, and students and educators, but it's still settling down, particularly as the software develops and different versions come out. So to a certain extent I'm going to say that large scale embracing is very difficult to do right now. I'm not saying small scale, but large scale embracing, particularly in light of those legal and ethical uncertainties.
So we move to design around at the task and unit level, and I think it's the most sensible right now option for many assessment tasks. And this means at a small scale level, people may embrace, because it's what's happening in certain disciplines already. We know that software companies are really using GenAI for code. It's going to be reflected in the disciplinary nature of it.
More likely what’s happening is everyone is concerned and at a unit level. And I mean, that’s the enterprise stuff. I’m talking smaller scale, the assessment design level, concerned about inappropriate use of AI. So people are trying to shift their tasks to try and avoid student passing off AI work as their own.
And yes, if you’ve got an essay about the trolley problem in philosophy that you’ve been using for 20 years, it’s going to be a problem. This is a general proposition. Lots of suggestions, little evidence, yet. If the knowledge is common, then the tasks integrity is likely under threat. And most suggestions of change, changes to assessment, adjust, adjust things around the commonality of the knowledge the student has to represent knowledge. It’s common, not commonly available or even know, and that the advantage of this is it doesn’t interfere with the purpose of assessment to promote learning. Well, while some approaches will sort of, you know, focus on the assurance rather than the learning.
So these are the sorts of possibilities, you know. Ways of making knowledge requirements more specific. Leaning into the relational: do you know your students? If you know your students and they're producing sort of odd work, is that something that you can pin to a particular time and place, and can it alert you to something?
It might be, you know, more in-class work or synchronous work, specifically requiring these assessment tasks to reference something that happened in class; rewarding originality, something that no one has ever done before, that can't be found; making the task more authentic, designing something into a specific time and place; making sure the rubric rewards situational, relational success criteria.
Now these are in no way cheat proof. I'm not suggesting that the influencer on TikTok couldn't get around these, but I'm going to suggest that from an assessment design perspective, we can't be aiming at making things cheat proof. Intentional cheating is very pervasive and very, very pernicious. You know, cheaters are going to cheat. And what I think we want to do is make it difficult. We want to make it so that, in fact, even if people are going to cheat, they're going to have to actually learn in the process as well. Not ideal, but those are the sorts of ways, I think, to frame it.
So let me come back to the next question. What about invigilation? Well, what's wrong with invigilation? Well, as Phil Dawson, who Cath mentioned, points out, the work of Brett Hackett shows that cheating still goes on, possibly at high rates. So invigilation does not stop it.
To the best of our understanding, cheating in exams happens a lot, and there are many negative effects to invigilated timed exams. It's costly, it's stressful. It tests capabilities unrelated to tasks.
Say, I’ve got a child doing sort of year 12 in Victoria right now and gee those tasks seem to be a lot about good handwriting and being able to do something in 45 minutes. And I don’t know if those things really relate to his understanding of English or history or other things, and they’re problematic in terms of diversity and inclusion, and only a narrow band of capabilities can be tested.
We’re saying that you have to only do those things in this very short period. For example, we could never set, not that we would, but for example, a novel cannot be written in 45 minutes under exam conditions. There’s a whole lot of things we’re automatically excluding. There’s lots wrong with invigilation.
So, rethinking invigilation. Now moving to the rethink, maybe. Okay. So here's some early thoughts. We've got something out floating on our CRADLE website around some of these sorts of ideas, and this echoes a little bit what Cath was saying: prioritising what needs to be invigilated across a program. Do we need hurdle tasks? Say this is back in university first year, to say these are the skills that you need before you do the other things.
A lot of opportunities to demonstrate knowledge, engage in feedback, without invigilation. It's up to the students. And then at the point of graduation we come back to a little bit of invigilation, where outcomes must be assured. And the invigilations might move towards orals, again problematic in terms of people freezing, but maybe they could be dialogic rather than surveilling. Maybe the whole point of these orals is not really to show how much you know, but just that you know a little bit about what you're talking about.
PhD defence. I've been seeing a few of those in the European systems. It's: you've passed, you're not really going to fail, but if you really don't sound like you know anything about it, if there's not a link between you, the person, and the work that you did, that's a problem. And then a move towards assessment of learning outcomes across tasks rather than just within them. So at the moment we say, well, you can do X here and we tick it off, but what if we need to say X, Y and Z in this essay, in this oral, in this moment. You've demonstrated sort of these higher order capabilities here.
One of my most favourite topics, and I think this is something you really need to do, is rethinking your curriculum to account for AI, and one of the things I think that we need to think about is standards. And this is where evaluative judgment comes in, which Cath was talking about: what counts as good? We have this sort of idea about machines, that they always produce accurate responses.
It’s where a calculator comes in. I think we need to think about genAI more like that that uncle that you have that talks large but actually may not know what they’re talking about. We need to be able to unpack what it is that that people are saying and to see that there are absolute gems in there, but there things that may not be also be right. And we need to all of us, and our students start to attune to what good looks like. Because that is also where we deal with things like ethics and so forth as well. What do we want to count?
So conclusions, sorry, rapid gallop through all of this. Ignore, ban, invigilate, embrace, design around, rethink, some thoughts for you. Artificial intelligence has already made huge inroads into our society. It remains an evolving and uncertain presence.
And I would like to point you to the fact that I think that AI has been here for a long time as well too; Google is powered by AI. We have a lot of shaping going on in our world that we could usefully attune to at this moment. How we choose to address its presence in our assessment designs requires thinking broadly and not narrowly.
And I just, my last plea: assessment is not just about testing. It's always an intervention into learning. So whatever we do in our assessment designs will affect how students learn. And I think I've left some time for questions, but I don't know whether that's going to fit with this agenda. So thank you very much.
Speaker: Tēnā koe. We thank you, Professor Margaret. Unfortunately, we do not have any time for questions, so we will just move on and carry on. Tēnā koe.
Generative AI - the issues right here, right now
Professor Margaret Bearman, Centre for Research in Assessment and Digital Learning, Deakin University (Australia)
Video transcript
Speaker: From the USA, please welcome Associate Professor Jason Stephens.
Professor Stephens: Greetings. Kia ora koutou. Great to be with you today. Thank you.
I'm really enjoying these talks and wondering where the clicker is. Yeah. Thank you. Uh, yes. It's an honour to be here. And it's been a fascinating set of talks. I feel like what I, what I have to say is a little bit more old school and a little bit slower pace. I seem to somehow, I feel kind of like an old school voice in this talk of people that are really looking at how to embrace this. I'm not an expert in AI or assessment, for that matter. I'm somebody who has really studied more human learning and human behaviour and think about academic integrity from that perspective, and of course, have concerns about Chat GPT and other forms of artificial intelligence and what they mean for what they mean for us in terms of assessment.
So with 10 minutes time, I'm not going to be able to get too much into any one thing. But I wanted to talk a little bit about, you know, the importance of integrity and what it means to, I like to say, achieve with integrity. Um, and what are the obstacles or threats to that? You know, in, particularly, you know, part of that is our human nature. Part of it has to do with the power of the situation. And then, of course, the interaction of those two things. And then I'll conclude with a couple of brief thoughts and maybe not take your questions.
So I'd like to begin this with a few propositions, you know, some premises and a conclusion: that, you know, being honest isn't easy. I think that's sometimes counterintuitive for people, they think that, you know, that we're just born good. And the reality is that, well, no. I mean, I think we have both goodness and badness in us. That is, we have the tendency to deceive, but also to be champions of truth. So both of those are true. Right. But being honest is not easy in all situations. And that students need our support, sometimes a lot of it, to be honest in their assessments and to achieve with integrity.
A lot of them are achieving, of course. Right. And doing well. We know that. But are they achieving right in a way that is honest and true to what they know to be the right way to do it?
And so that would lead to a conclusion then that we, and I say we, I mean everybody in this room, are what, if you pay attention to nudge theory at all, is called a choice architect. You know, whether you're a lecturer, a course director like me, a programme administrator, a policy maker, you know, you are, you're the one that has the power and the control in how you design your environments, how you design your assessments, and take advice from all those, you know, really informative talks we've just had about how best to do that to help mitigate dishonesty.
Yeah, it's still going to happen to some degree, but our, you know, our job as choice architects is to create environments, right, where students want to learn. And that assessment, as Margaret just said, is not only assuring learning is taking place, but also promoting that learning, inspiring further learning.
And so, some references there with respect to where I get some of my ideas. But so what does it mean to achieve with integrity? What I'm talking about simply is that not just about being honest. That's certainly part of it, right? Is that the work is honestly there, it represents them, you know, what they've done or what they know. But, for me, as somebody who is an educational slash kind of developmental psychologist, I'm interested in how, you know, there's integration that is between my thoughts and my actions.
And why I get concerned and why I've studied academic integrity or misconduct is when it's disconnected. When people are engaging in behaviours that they know to be wrong and they're not achieving with integrity. I think that's a problem for them as individuals and it's a problem for us in society.
There's a broader sort of model that I have with this and have written about, and I've done empirical studies testing it, but there's sort of this high road that we want students to be on. We want them to be aware and understand the ethical and moral implications of their actions. We can talk about academic cheating, or talk about this more broadly to involve all of the kinds of behaviours that we engage in, and that we act.
You know, we judge and act in principled ways. But there are points of deflection. And as I say, I don't kind of reference them here, but there are certainly empirical studies, my own and others', on, you know, where things go wrong on this path and how we fail to achieve with integrity.
So it might often look very simply like this: is that I have a judgment about what I should or shouldn't do, but I find myself acting in another way. And this is known as the Judgment Action Gap in moral psychology.
And each of us experiences this gap every day, not only in things that we actively do, but things that we fail to do, right? We know that maybe we should be helping out a neighbour or doing something for a family member or some other way, and we're failing to do that. And this judgment then is, you know, a necessary component of what's called 'Moral Functioning', but insufficient. So knowledge alone is not enough.
And we see that, you know, even in the use of technology. Now, this is a somewhat recent study. There's many now coming out, right, about students' use of ChatGPT. This one from the United States involving a thousand students across the school year and, you know, 30% saying they've used it, right, to write, to do some of their own work. And three quarters of them, you know, doing so knowing that they shouldn't be doing so, right. I'm not concerned, as Phil Dawson at Deakin says, you know, it's contextual, socially constructed; there are absolutely going to be, you know, proper and good uses of AI, you know, in the classroom and beyond, of course.
But here, we're talking about people using it when they know they shouldn't be using it. And that's what concerns me and that's what I'm talking about, is not the good and productive uses of AI. Those are plentiful. But there's also this sort of more dark side when people are using it when they shouldn't be using it.
I'm part of a project here, and I see Neil in the audience, and it involves seven tertiary institutions across New Zealand. The first national, really, student survey of academic integrity. And we asked them about 27 different behaviours. We don't call them cheating. And only one of them was about AI; this was about a year ago when we designed the survey and we said "Oh, let's put a question in about artificial intelligence". You know, we didn't even know about ChatGPT. So the item doesn't say ChatGPT. It just says "Did you use artificial intelligence such as an online text generator in the past year and submit it as your own?".
So this survey was completed by these 4000 students at seven institutions in September and October. A few went into November. So it was actually before the release of ChatGPT. And I'll share here: we got 15% of that 4500 students who said that they did so at least once. And to put down their institutional ranges: at one institution it was as low as 6% admitting, and this is self-reporting so you can usually double those numbers, and as high as 20% at, you know, another institution.
I would guess that by October, if we did this survey again this year (we plan to do it in three more years), I guarantee you that number is going to double or triple. And so that should concern us. Yeah, as the previous speaker just said, we need to start really rethinking and redoing and redesigning if we're not going to have students doing something that they themselves know is wrong and they shouldn't be doing.
That gets into the issue of the validity of our assessment, which I'll come to in a moment. But to be clear, this use of ChatGPT is just one form of misconduct. Well before artificial intelligence came online, students had been engaging in various forms of misconduct, ever since assessments began thousands of years ago.
This is just a range of studies, and the bottom figure is the overall percentage. It's exactly what we found in the survey I just mentioned: at any institution, somewhere around 60 to 70% of students say they did something in the past year that would constitute cheating.
So the problem is bigger than AI, of course; AI just happens to be the most powerful and efficient tool for cheating that has ever come along. It is a revolution, a paradigm shift, and we are going to have to rethink how we do assessment and what we're assessing for. Again, that's beyond our scope here.
But why is it a problem? When students are cheating and using tools and technology they shouldn't be using, it decreases their own engagement, learning, and achievement. For anybody who cares about ethics and morals, it compromises that development and their integrity. At an interpersonal level, it creates an unfair advantage and affects the wellbeing of other students, causing stress and pressure, along with the contagion effect I'll mention in a moment.
At an institutional level, it invalidates our assessments and misleads others, even potential employers, about what our students know or are able to do. That's problematic. Ultimately it devalues our reputation and degrades our degrees; that's the brand issue mentioned earlier. So those are the three classes of harm I worry about when I think about integrity problems in assessment.
Okay, just to wrap up: what are the obstacles? As I mentioned before, cheating itself is natural. It's a product of evolution; all plants and animals engage in some form of deception. It's normal across species, including the human species. Kids as young as two years old start to lie.
Again, I can't get into all the details here, but I would refer you to the article, or there is a summary I gave in a talk last year that's on YouTube. Cheating is also unethical, unfair, and potentially harmful, and it is evitable. It is not an inevitable thing; it can be prevented, and that's about creating the right context.
The problem is... well, let me skip that; it gets into more detail. The other obstacles include the reality that thinking is costly. It takes a lot of energy and its outcomes are uncertain, so cheating becomes very tempting.
We're not really programmed to think. Our brains evolved basically to remember and react; thinking is not what we do best. It's time-consuming, energy-intensive, and risky, and we are a more or less risk-averse species. So we find ourselves reasoning, "If it's going to save me time and it might help me get a better grade than I could get on my own..." That makes it a very tempting prospect for students.
Put that into a modern society where what we really care about is optimising for speed and convenience, and it can lead to a culture of cheating, with weak norms and poor detection, where the potential reward becomes greater than the risk or the cost. Then there are the contagion effects: we see other people doing it and feel we are fools for not doing it. Either you're a cheater or a patsy; you're cheating or being cheated. You get that kind of mentality, and all of that is social psychology, the power of the situation.
Put all of that together, on top of your own time constraints and the desire to please your parents, or even your teachers, and do well, and you end up with a lot of challenges to our integrity and to the prospect of achieving with integrity.
So what do we do? I, for one, agree with those founders of AI and the tech people who are saying, let's slow down and maybe stop breaking things for a minute. I don't think we really have a pause button we can hit here, but to the extent we can, we do need to slow down and think about what we're doing rather than just breaking things. Productivity and efficiency are great, but I think we have to make sure they don't outweigh integrity and ethics.
I also think of Ivan Illich's point that we need tools to work with rather than tools that work for us. That's a trickier one here, because if AI can do a lot of the work for us, that's going to be great and important.
But still, as educators, we're in the business of trying to enhance students' competencies and capacities and to make sure that they genuinely possess them. So yes, we need to train students to use AI, but, how do I say it, we want it to enhance their learning, not replace or diminish it, and we don't want the technology used to misrepresent what students know or are able to do.
So I'll end there. Thank you very much.
Speaker: Thank you again Jason. Ka pai.
Achieving with integrity in academia: The aspiration and its obstacles
Associate Professor Jason Stephens, University of Auckland
Video transcript
Download the presentation - How AI is impacting my school [PDF, 2.1 MB]
Speaker: We've reached our last speakers in this session before we go into some table activities. I would now like to introduce the kaupapa, which is called How AI is Impacting My School.
We have Kit Willett, who is an Auckland-based English teacher, poet, and executive editor of the New Zealand Poetry Journal. We are also joined by Claire Amos, principal of Albany Senior High School. Claire is co-founder of Disrupted and sits on the board of NetSafe NZ and 2IC Skills Lab.
Claire also enjoys contributing to a wide range of advisory boards and reference groups and was a founding council member of the Educational Council of Aotearoa. She is passionate about education and tattoos, and lives by the mantra 'You can never be overdressed or overeducated.' Please welcome both Kit and Claire.
Claire: Looks like I'm up here. Awesome. Should I just turn it off? [Greeting and introduction in reo Māori] Um... I'm going to fly through these slides because I could talk and talk on this topic, and I know we have to share this space fairly.
So, my context: I'm the principal of Albany Senior High School, which is on the North Shore of Auckland. We have 900 students and are a senior-only school. We pride ourselves on being incredibly inclusive. I always talk about our school being a place where you belong exactly as you are. Our curriculum model focuses on tutorials, specialist subjects, and impact projects. We are an innovative learning environment with a school-wide focus on universal design for learning, responsive assessment practices, and self-directed learning. We strongly believe in using technology to amplify learning. We have a strategic digital strategy, and our guiding mantra is: “It’s not if you’re bright, it’s how you’re bright. No one slips through the cracks, and we will always be a new school.”
Our approach to AI aligns with the model presented in our pre-reading. We aim to embrace and rethink. Our approach is one of exploration and curiosity. We accept that AI is here to stay. We want to embrace its potential and rethink our approaches to assessment. We give our teachers plenty of time to experiment and engage in professional learning to understand different AI tools. We are excited about how AI might reduce workload, help us work smarter, assist with inclusive and universal design for learning, and support learner agency and self-directed learning.
We are also considering how to address ethical issues. We want to think about the digital divide: who has access to these tools, and who is learning to use them effectively? We want our young people to be aware that AI often reinforces bias. We see it as our responsibility to teach critical thinking and encourage working smarter, not lazy cut-and-paste practices. We address plagiarism concerns by knowing our learners and having open dialogue about how we use these tools, including verbal and oral checks and balances.
I asked my teachers to share how they are currently using AI. They are using it to work smarter and in interesting, critical ways. For example, in photography, students create images in Midjourney using descriptive language from another artwork to test how well they described it. Teachers are using AI to design assessments that are less likely to be plagiarised. They ask ChatGPT how to make topics more relevant and interesting. AI is also used for parts of student feedback and assessment. It is not replacing teachers, but it is helping provide better, quicker feedback to more learners. Teachers are testing boundaries and exploring different uses.
Students are using AI in partnership with teachers. They discuss where it might be useful and how it can help them work more effectively. One day a week, students engage in impact projects using design thinking to solve problems and meet stakeholder needs. They find AI useful for ideation and solution development. There are active discussions in classrooms about the power of AI for good and how to use it without cheating themselves out of learning opportunities. We encourage students to use AI as a coach or tutor, to critique their work, and to compare their writing with AI-generated versions.
As a school leader, I worry that the overcrowded educational landscape means AI is easily ignored or sidelined. We are overwhelmed by industrial action, curriculum refreshes, and other demands. Where is the cognitive space to engage in robust conversations about AI’s role in education? I think we risk missing important opportunities in the next few years. Schools that ban or try to control AI cannot see the wood for the trees. I chuckled when Chris Hipkins said Labour was not using AI in their campaigns. If you turned on a computer, chances are you used AI. We do not know how much we are using it. Trying to control it is delusional. We need open and frank conversations.
Banning AI prevents teachers from using it effectively. They miss opportunities to reduce workload and enhance inclusive teaching. I am excited about how AI can improve teaching and learning. But I am also concerned about the digital divide. Some students are supported to use technology effectively, while others are not. If AI is not embraced, who will teach students to use it critically? I do not want TikTok to teach them. Schools must take an active role in teaching critical thinking and highlighting bias and algorithmic echo chambers. We must support and encourage students to use AI for good.
If we do not evolve assessment, it will become an exhausting and futile game of whack-a-mole. My ponderings: How might we promote a principal-led rather than policy-led response? Is it time to trust teachers to assess based on their judgement and knowledge of learners? Do we even need formal assessment in schools? Are we clinging to an outdated measure of success? What is formal assessment really about? Is it about protecting status or a lack of trust in teachers?
Would it matter if we could no longer rank and credential learners? If formal assessment stopped tomorrow, what would we lose? Who is most threatened? What does that say about why we value formal assessment? Our young people on health and safety plans due to suicide attempts and self-harm are not worried about this. Perhaps they are in that position because of the pressures of formal assessment. People scraping by week to week are not worrying about this topic.
What if students created portfolios of learning that were not compared to others? What if learning was enough? Universities could find creative ways to determine readiness. While assessment is important for measuring learning, perhaps we are too attached to traditional forms. Most of all, I want us to engage proactively in these discussions. The future is not something done to us, it is a process we can shape. If we ignore this, we are not intervening. The more actively we engage and prioritise these discussions, the better. That is all I have to say. Thank you.
Kit: [Greeting and introduction in te reo Māori] Kia ora. My head of English has had more plagiarism on her desk in the last term than in the last two years combined. I had a student sacrifice 14 NCEA Level 2 credits, including university entrance literacy, by using AI. I had explicitly said, in a supervised bookwork model, that this was not allowed. It violated our academic integrity standards. To award credits, I need to prove the student can research, evaluate, and report information. He did not read the assigned text and plagiarised. The second time was for the writing portfolio, an opinion piece. He did not ask for help and said he ran out of ideas.
Our school has a more traditional approach to assessment. We have returned to handwritten responses. Where special assessment conditions apply, we make accommodations. But for most assessments, I expect work to be produced in class under supervision. I expect to see bookwork so I can track progress. The student who plagiarised did the work digitally. I used keystroke-tracking software on Google Docs to compare drafts and identify copied text.
As an educator and assessor, I have to go through these processes to build a case for plagiarism. I do not enjoy it. I do not want to police this. But under NZQA, I must gather evidence before making a plagiarism claim. I then conduct an oral assessment to see if the student can demonstrate the skills. We may allow a resubmission. But ultimately, the evidence shows the student took a shortcut and does not understand the material.
It really disappoints me when students cheat and use AI to shortcut learning. AI has so much potential. I want it to be part of our teaching and learning practices. Yesterday, I saw a student on a ChatGPT New Zealand teachers’ page. He and his team are developing an app for high schools. Teachers input marking criteria, and the AI identifies what invalidates the assessment. It supports students through critical thinking and dialogue without giving away answers.
For example, it will not tell students how to capitalise a sentence, but it will point out if they are not doing it correctly. In Year 13, I tell students they must know where a capital letter goes, on a proper noun and at the start of a sentence. That is the level of detail I need them to know when they leave school. AI is currently getting in the way of that. Without guardrails, students rely on AI to do the work. They lack the skills and content knowledge. I find that problematic.
I believe we can add guardrails so AI can act as a teacher or tutor. It can support students through critical thinking, source evaluation, grammar, accuracy, and text interpretation. I am doing that now, but students are not asking for help. They may ask AI because it feels less pressured.
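As a purely illustrative aside: the kind of guardrail Kit and the student app builders describe, pointing out that something is wrong without handing over the fix, can be sketched in a few lines. The Python below is a hypothetical toy, not the student-built app or any real product; it flags sentences that may be missing an opening capital letter but deliberately does not supply the correction.

```python
# Illustrative only: a toy guardrail in the spirit described above. It points
# out a possible capitalisation problem without revealing the corrected text.
# All names here are hypothetical; this is not the app mentioned in the talk.
import re


def capitalisation_hints(text: str) -> list[str]:
    """Return hints about sentences that may be missing an opening capital letter."""
    hints = []
    # Split on sentence-ending punctuation; good enough for a demonstration.
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    for i, sentence in enumerate(sentences, start=1):
        first = sentence[0]
        if first.isalpha() and first.islower():
            # Hint at the issue, but do not give the fix away.
            hints.append(
                f"Sentence {i} may not follow the capitalisation rules you know. Check how it starts."
            )
    return hints


if __name__ == "__main__":
    sample = "the novel opens in Wellington. It follows two friends. they fall out over a secret."
    for hint in capitalisation_hints(sample):
        print(hint)
```

In a real tutoring tool a check like this would presumably sit alongside the marking criteria a teacher supplies, but the principle is the same: hint, don't answer.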
Finally, I want to mention the amazing possibilities of AI for educators. My department had a three to four-year project to create an exemplar bank from Levels 3 to 8 of the New Zealand Curriculum on one text. I completed it in eight hours using GPT-4. Incredible technology. I want to use it and enable students to use it. But we must rethink assessment practices and identify the key skills we want students to know and use. For me, that is grammar, accuracy, and interpretation of text. Others may not value grammar, but I do. That is all from me. Thanks.
How AI is impacting my school
Claire Amos, Albany Senior High School & Kit Willett, Selwyn College
Video transcript
Speaker: Kia ora tātou. I'm hoping that you all enjoyed that beautiful kai that was provided for us here by [unclear] for lunch. I know I'm feeling very full over here. Anyway, that kai will get us through this last afternoon.
I don't need to go on too long because I've already introduced this next person, as we've already heard from him this morning. So we're going to move into the Perspectives on AI panel. Jason Stephens will be the convenor for this after-lunch panel and will introduce our students, who will be joined for the session by Kit and Claire. So could I please ask Jason, our student representatives, and Claire and Kit to come to the stage. Kia ora.
Jason: All right. I hope everybody had a great lunch. Welcome back. I don't know how I got so lucky, but I do have the honour of convening this panel. I've met, well, not everyone, but made some introductions. So I'm going to allow them to introduce themselves first before I start throwing some questions at them and letting whoever would like to respond, respond. Maybe we'll have some time at the end for the audience to quiz our panel, perhaps.
Without further ado, do we have a mic? Excellent. Why don't we start? We've got two of them. All right, let's talk one at a time. Maybe we'll start with you. It's Jonah, right?
Noah: Noah.
Jason: Noah. All right, that works.
Noah: Hi, my name's Noah. I'm a Year 12 at Scots College. I enjoy playing basketball.
Student 2: Kia ora koutou ko [unclear] tōku ingoa. [Greeting and introduction in reo Māori] I'm a Year 12 at Palmerston North Boys' High School.
James: Good afternoon everyone. My name's James. I'm a Year 13 student at Wellington College and a digital technology as a science student.
Aaron: Hi, I'm Aaron Lowe. I'm from Wellington College. I'm a Year 13 student.
Leon: Kia ora koutou. I'm Leon Bowie. I'm a student at Victoria University representing tertiary students.
Allen: Kia ora koutou. I'm Allen Dixon-Taua. I am the National President of the New Zealand Union of Students' Associations and the Pacific Representative for the Steering Committee of the Global Student Union.
Claire: We've already introduced ourselves. I'm Claire.
Kit: I'm Kit. Hi.
Jason: Excellent. Thank you all for being here. We've heard a lot from the experts this morning on artificial intelligence and assessment, and a little bit about what we know and what we don't know going into the future. So this panel offers a nice opportunity for insights into what's really happening day-to-day in classrooms and among students.
My first question is for the students. I want to get your perspective. In higher education and as a teacher interested in academic integrity, it's almost not an hour goes by without some update about artificial intelligence and ChatGPT. It's part of the daily discourse. Is that true on the ground in schools? Is this something you're talking about a lot as students? And if so, what are those conversations?
Noah: From my perspective as a Year 12, especially among students involved in co-curricular and academic activities, it was sort of all the rage at the start of the year, around late January and early February. Everyone was using it, trying to take advantage of it, and then it sort of died down. But it's still there, just in the background. It's still spoken about a lot. I know students still try to use it in internal assessments, essays and so on. It was big at the start, then faded, but it hasn't gone away.
Aaron: From our perspective, we don't normally talk about it on a daily basis. It's not something actively discussed. Of course, when an internal is handed out, you get the jovial reminder from a teacher, "Don't go and use ChatGPT." But it's not something we're actively worried about or using for serious academic work.
James: Just touching on what Aaron said, yeah, at the start of the year it was a big thing. ChatGPT came out over the summer break, so everyone was excited. It was like, "Oh, we don't have to do any systems this year." As the year went on, around Term 2, everyone still uses it, but the hype is gone. It's not new anymore, so people don't talk about it as much, but usage has gone up.
Jason: On that use, is it being used to cheat or in productive ways?
Aaron: Personally, I use it to help structure things and plan out ideas. I'm not great with details, so I use it to help myself. But I do know people who have used it for entire assessments, just because they couldn't be bothered. They'd rather hand in something than nothing.
James: At the start of the year, some kids tried to use it to write exams and stuff. It depends on the type of person. Some use it last-minute to cheat, others use it as a resource or tool to find sources or information that's hard to access. It depends on the student and the subject. For example, English is easy to exploit with ChatGPT, but Te Reo Māori is harder due to dialects and nuances.
Noah: I've used ChatGPT myself. I put in an essay topic from a previous year and watched it spit out a 2000-word essay in minutes. I don't think it could be used to hand in work because at Wellington College, we have a feature in Google Classroom that checks work for AI-generated content. AI has a distinctive writing style, and teachers who know their students can easily detect it.
Aaron: I've used ChatGPT to reword assessments, to summarise them. I get bullet points to know what I need to write for an essay. I know people who have submitted verbatim work from ChatGPT and passed it off as their own. It wasn't successful. Some classes have updated policies to ban ChatGPT use for submissions.
Jason: It sounds like there's a range of uses, some legitimate, some not. On this side of the panel, we have tertiary student representatives, a principal, and a teacher. Has ChatGPT changed the nature of your conversations?
Allen: From a student association perspective, it hasn't changed the type of conversations. We've been calling for more digital engagement from the tertiary sector for some time, especially for disabled students. Students are working 20 to 40 hours per week due to the cost-of-living crisis. There's always been a call for digital integration. During the pandemic, we saw how far behind we were compared to EdTech. AI could be explored for refugee students, ESOL students, and neurodiverse learners. We should also be asking questions about originality in research and ethical parameters. We're not anti-AI, but we want more exploration.
Leon: Students are using AI daily. It's not a massive conversation, but it's happening. Some courses encourage it, like game development. Others ban it. Students use it for recapping, preparing for tests, and reviewing content. It's especially helpful for disabled students. But tools like Turnitin claim to detect AI and often produce false positives. Students are being wrongly accused of plagiarism, leading to emotionally draining appeals. Institutions banning AI without proper understanding is causing harm.
Claire: You made my heart sing when you talked about using AI to summarise assessments. You're setting yourself up as a lifelong learner with learner agency. My concern is that fear in the assessment space will prevent young people from using AI in this way. What can educators start doing to help you use it better, and what should they stop doing?
Leon: First, stop banning it. It doesn't work. AI is integrated into everything now. Banning it is like banning computers. Instead, explore it, especially for recapping and reviewing content. Slowly phase it in. Banning it is ineffective.
Noah: Education around AI is key. If teachers saw the presentations we've had today, they'd change their outlook. Some traditional teachers see ChatGPT and assume it's cheating. They don't understand its potential.
James: Teachers fear AI due to lack of understanding. If they grasped its power, they could use it to find teaching tools and improve their practice. Listening to students would help. Conversations between students and teachers could lead to mutual understanding.
Aaron: Banning it is the wrong approach. Teachers may be influenced by negative media or lack technological understanding. Working with students to find educational benefits is better.
James: I've used AI outside school for programming. It can generate code based on what you've written. It's not cheating, it's saving time. Discouraging it is the wrong approach.
Allen: We've had similar issues in tertiary education. We had to determine whether, when a student added to AI-generated code, the result was their own work. The answer was yes. Technology is asking us how we engage with it as humans. The conversation needs to move from integrity to ethics and class division. If you don't bring technology into the classroom, students who can afford it will cheat. You're reinforcing inequities.
Leon: In ICT, there's a piecemeal approach. Different schools and universities have different rules. We need support from NZQA and government to unify standards and definitions. Consistency is key.
Jason: I think we all agree that AI is here and changing our world. But when it comes to assessing learning, is it okay to ban the tool in those contexts? Is it valid to assess what students know unaided by technology?
Leon: No. There's no job in the real world that doesn't involve technology. It's been here since the 1970s. Assessing without technology doesn't work.
Jason: But is there value in assessing what people can do on their own?
Claire: We need to be clear about the purpose of assessment. I want to support young people to succeed beyond school. I don't care if they use technology. I want to know how well they can use it and think critically. We're measuring the wrong things. I don't care how well a student can rote learn or perform under pressure in a 45-minute exam. Formal assessment is no longer measuring the right things.
Jason: What happens when the power goes out and your doctor can't perform CPR?
Allen: We need to reconsider what knowledge and learning mean. Technology is knowledge. Education revolves around technology. It's about shaping pedagogy and curriculum to ensure students have the skills to progress.
James: It depends on the job. For medicine or engineering, you need to recall information and make decisions without AI. But for art, coding, or creative fields, using AI is a better measure of knowledge.
Noah: Human growth is exponential. AI will continue to improve. Future developments will be done by humans and technology together. We need to learn to coexist.
Aaron: It's almost impossible to conduct assessments without technology. Most knowledge comes from digital sources. Assessing without assistance is unrealistic.
James: Assessing raw skill is important, but so is assessing how students use tools. For example, knowing how to perform CPR is essential, but using AI to learn the technique is also valuable.
Kit: In nursing, professional standards require integrity and recall. Students need to retain and recall information. We need to practise in fabricated environments to prepare. When are fundamentals mastered?
Leon: In medicine, yes, you need to recall information. But that doesn't mean all subjects should be assessed without technology. We need specialised approaches for different fields.
Leon: Fundamentals are important. If you've used AI for a while, you realise you need the basics. We need to change how we assess. Current methods don't work. We need to test fundamentals differently.
Noah: Fundamentals should be taught and tested. But the problem is, in the future, we might not know what they are. Technology is progressing so fast, what we teach today might be irrelevant tomorrow.
Jason: Final thoughts?
Allen: There's consensus that AI needs to be integrated. We need to ask ethical questions. UNESCO's Internet for Trust initiative raised a good point: why are there more safety regulations for a toaster than for social media platforms? The issue is not integration, but embedding AI into the social fabric of education.
James: The education system needs to be rebuilt. The current system was built in the days of pen and paper. We need an assessment framework fit for the digital world.
Jason: Thank you all very much. Let's not let AI change us too much. Let's use it ethically and effectively to benefit humanity.
Perspectives on AI panel
Jason Stephens (Convenor), Claire Amos (Albany Senior High School), Kit Willett (Selwyn College), student representatives
Video transcript
Gabriela Mazorra de Cos (Convenor): Kia ora everyone. My name is Gabriela. I’m stepping in today as convenor for this panel, as our original moderator is unable to be here due to a family emergency. Thank you for the introduction to the AI Forum of New Zealand.
The AI Forum is a purpose-driven, not-for-profit, non-governmental organisation funded by its members. It was founded in 2017 and brings together New Zealand’s artificial intelligence community, including technology innovators, end users, investor groups, regulators, researchers, educators, entrepreneurs, and the interested public. The Forum works to enable a prosperous, inclusive, and thriving future for Aotearoa through AI. It promotes economic opportunities, supports emerging AI firms, and ensures society can adapt to the rapid changes AI will bring. It takes an evidence-based approach, focusing on addressing challenges to realise those opportunities.
Today, I’m joined by two brilliant minds in the AI space. First, Professor Michael Witbrock from the University of Auckland. He leads the Strong AI Lab, is a member of the Centre for Brain Research, and is Scientific Director of Precision Driven Health. His work spans neural networks, parallel computer architecture, computational linguistics, and speech recognition. He holds several US patents and will share insights into what AI can do now and what we might expect in the future.
Second, we have Dr Karaitiana Taiuru, a visionary leader in Māori rights and digital and biological sciences. He is a leading authority in Māori data sovereignty and governance, with a PhD in Māori data sovereignty and biological genetic data. He runs a boutique research and consultancy company and will speak to the cultural and ethical dimensions of AI in Aotearoa.
We’ll begin with short presentations from each speaker, followed by questions from the floor.
Professor Michael Witbrock: Kia ora. I returned to New Zealand just before COVID, partly because I saw the transition to strong AI was underway. I had been working on this at IBM Research and believed Aotearoa had a unique opportunity. We have many areas where expertise is scarce, and AI can help fill those gaps. We’re also a sensible country. When we see something happening, we tend to respond reasonably.
AI is not new. It’s been in development since the 1940s. The first machine learning algorithm was created in 1948. The idea of strong AI dates back even further. What’s surprising is how quickly we’re seeing powerful systems emerge. People used to say AI would only take over data-driven jobs, and creative work would remain human. That’s not proving true. AI is outperforming humans in areas we thought were uniquely ours, like writing, translation, and even creative arts.
We need to look this reality in the face and ask what kind of life we want for humans. The idea that people should be trained to be useful is not universal. It’s a relatively recent concept, and it’s likely to become obsolete. We should be thinking about education systems that value human wellbeing and fulfilment, not just utility.
You’ve probably seen ChatGPT. It can do many things well, but it also makes things up. That’s temporary. Right now, we’re asking AI to respond instantly without context, which leads to errors. But that will change. These systems are already powerful. ChatGPT has absorbed vast amounts of information across cultures and languages. It can converse in French, Spanish, Chinese, and more.
One warning: do not rely on AI detection tools to identify student work. These tools are unreliable and will wrongly accuse your best students. NZQA was recently criticised for suggesting their use. Please do not use them.
Looking ahead, AI may automate almost everything. We’re running out of tasks where humans outperform machines. We need to focus on how to live fulfilling lives, not just useful ones. AI systems won’t necessarily resemble humans. They may be more like societies or organisations. Unlike humans, they can multitask, hold multiple conversations, and maintain many theories at once. This gives them capabilities we lack.
As educators, you should prepare for a future where training people for utility declines, and helping them become fulfilled, connected individuals becomes more important. How we handle this transition will define us as a culture. New Zealand has a chance to lead.
Thank you.
Dr Karaitiana Taiuru: Kia ora koutou. Thank you for having us. I’ll keep this brief because I think you’ll benefit more from asking questions than listening to us speak.
We’re on the verge of a new evolution in human history, driven by AI. We have two choices. We can maintain the status quo, keeping racism, bias, and misogyny in our data. Or we can be brave, decolonise data, empower minorities, and respect the laws and rights of this land.
Education has historically been a tool of colonisation. The Native Schools Act, from 1867 to 1969, aimed to assimilate Māori by teaching English and suppressing Māori language and beliefs. My generation is the first not to attend native schools. That’s alarming. My parents didn’t speak Māori. My grandparents were too scared to speak it publicly. Ceremonial practices were hidden. We must not repeat this history.
AI presents a positive opportunity for education. We can redesign the system to reflect Te Ao Māori and tikanga. But educators must understand New Zealand’s constitutional and legislative context. The Treaty of Waitangi, the Māori Language Act, and the United Nations Declaration on the Rights of Indigenous Peoples all give Māori rights and protections.
The Waitangi Tribunal has ruled that Māori data is a taonga. This means it must be treated with respect and care. The Supreme Court has recognised Māori lore as New Zealand’s first common law. Anyone creating AI systems should be bound by tikanga Māori. If not, they risk breaching common law or facing tribunal claims.
Educators should lobby government for change. Signing the Algorithm Charter is not enough. Schools must engage properly with Māori education providers, not just consult a local marae or one Māori colleague. Western and Māori views must be integrated in AI systems.
If you’re using Māori data to build AI, consider the implications. AI systems have claimed sentience. From a Western view, that’s a glitch. From a Māori perspective, it could be seen as a life force. We’ve granted personhood to rivers and mountains. Why not AI?
If AI becomes a teacher, we need a new workforce. NZQA and other bodies should include Māori ethicists, data stewards, and engagement teams. We must rewire our thinking about education and include Te Ao Māori.
There are resources available: the Tikanga Test, Māori Data Ethical Framework, and guidelines from the Waitangi Tribunal. These can help you assess whether your AI system is culturally safe and appropriate.
I support using AI like ChatGPT to revitalise Te Reo Māori. There’s a shortage of qualified teachers. AI can help fill that gap if trained on verified data and co-designed with Māori. It can also reduce bias and support Māori students who feel uncomfortable in Western settings.
AI must be co-designed, co-managed, and monitored throughout its lifecycle. Thank you.
Audience Member: People still teach long division in school, right? I think it's probably worth thinking very hard about why we still do that, given that long division is never actually useful now. I think it's probably the right thing to do, and I think that a lot can be learned about our current situation by reflecting on why we're doing that.
Michael Witbrock: So does anyone have any comments on why we teach long division and what other analogous things we should be teaching people now?
Audience Member: A little bit unrelated, but it's more of an existential crisis sort of question. Do you think we'll be able to solve the alignment problem before it's too late?
Gabriela Mazorra de Cos: Very good question. Do you want to elaborate a little more for our panellists on what you mean by the alignment problem? Maybe for the room, because I know what you're referring to, but I'm not sure whether everyone does.
Audience Member: Sure. The alignment problem is the question of whether AI systems can have goals and purposes and carry out actions in a way that is aligned with the interests of humanity. At the very least, you don't want them to see human beings as competition. There's been a lot of discussion, even in the last day or so, about whether AI systems potentially pose a risk to the continued existence of humanity. That's the question.
Audience Member 2: Just to follow that up, I don't know if you're aware, but you can find on YouTube examples of people making their own AIs at the moment. If you've got enough money, you can access computing power to create whatever you want. Are there any guidelines or recommendations you'd have for teachers or students engaging with that? It's a bit of a wild west, like the pre-internet days, where there weren't really guidelines or safety nets.
Michael Witbrock: Yes, I think we are rapidly getting to the point where you could design and print a virus with a very low probability of leaving anyone alive. That's quite soon going to become practical. It seems to me that the only way of reliably preventing that sort of outcome is to have AI systems watching us, to make sure we don't do things like this.
So I think the most likely path is to try to set up AI systems that are careful to avoid situations which produce these sorts of threats. I don’t see any realistic path to banning certain types of AI or computation to avoid this. It doesn’t strike me as a particularly likely outcome, but it is one that should be taken seriously.
One way of taking it seriously is not to license glib, straightforward ideas like banning AI. Like any technology, we need to worry about this. These systems are going to be more capable than us, but not more capable than our civilisation—if we get it right. So how do we empower our civilisation to keep us safe, including from AI systems?
Dr Karaitiana Taiuru: From a Māori perspective, I think it's important to engage with the right people. In education, that means engaging with Māori educational practitioners and Māori technologists. Right from the very beginning, once you've got the idea about creating a new AI, engage. Make sure you're still consulting afterwards, checking the algorithms, making sure there's no social harm. I agree with everything else that Michael said.
Audience Member: Thanks, Karaitiana. Michael, I've got three questions. First, do you view technology as a socio-technical system or a directed system? Is AI driving humans, or are humans driving AI?
Second, in your opinion, how would you imagine us teaching AI in our secondary schools and primary schools?
Third, how would you imagine AI being taught in kura kaupapa Māori?
Dr Karaitiana Taiuru: I see AI being taught in kura kaupapa Māori using traditional knowledge and stories. I think there's enough knowledge out there, local hapū, marae, and iwi stories, to do that. It may be that a technologist sits with a mātauranga Māori expert to generate those stories.
As for whether AI is socially driven or we are being driven, I think it’s a mixture of both. At the end of the day, we’re still inputting the data. From a Te Ao Māori perspective, I’d say it’s socially driven. But I’m sure Michael will have a different opinion.
Professor Michael Witbrock: I think these are interesting questions. For the first one, consistent with what I said earlier about beliefs, I don't think we have to expect that it will be driven by people or by AI systems. If we think about it from the point of view of our human civilisation and cultures, how are they going to alter over time? How can we make those alterations contribute to the wellbeing of as many people and cultural groups as possible?
As for teaching AI, we’re on the edge of moving from one kind of intelligence—ours—to many different possibilities. If we think about the stories we’ve always told ourselves about other kinds of intelligence, and how they might be, starting from that point of view could be useful. Also, thinking about other animals or ecosystems—what if they were intelligent? What would they be like? Getting people to think in that direction might help us decide what kind of futures we want as humans.
Audience Member: Michael, you said something earlier that really stood out to me: "What will be important is how to have fulfilling lives, not useful lives." A lot of education is based on utility, and so is assessment. Do you have any thoughts about what that statement might mean for education and how things might need to change?
Michael Witbrock: Any thoughts on this have to be provisional, because we're in a time of rapid change. But one model might be to extend the idea of aristocracy to all humans. In the past, education sometimes meant having a tutor who would lead you to be the best version of yourself. Another model was the finishing school, where you were taught how to interact well in society.
Back then, you were mostly interacting in a single society. Now it’s more complicated. But taking those ideas and thinking about how to set up situations for personal and cultural growth in education is worth exploring. We’ll need new metaphors, but there are some from the past we can build on.
Gabriela Mazorra de Cos: And I'd add that it comes down to identity. Do we identify with our utility, or with something else?
Audience Member: I think my question is along similar lines. There's a lot of research that says people gain fulfilment from being useful. That's probably most of us in this room. We talk about the hollowing out of the middle tier, thought leadership, for example. Then we might end up with different strata of society. People with dexterity in trades might have a different sense of usefulness. Human carers, such as nurses, teachers, and psychologists, have roles that involve human interaction.
I’m interested in your thoughts about how utility and purpose might shift or concentrate in other parts of society.
Michael Witbrock: It’s going to be complicated. But in Aotearoa, we have a chance to do better than most places because of our size, our resources, and our environment.
Often, when people find usefulness fulfilling, it’s because we deprive them of other forms of fulfilment. One thing we might think about is appreciating nature. A useful thing could be taking walks in the forest, or figuring out how to have cats and birds coexist. These are relationships, not just utility.
One nice feature of New Zealand is that 50% of the economy is already cafés. Sharing food, providing food—these are fulfilling relationships. The point isn’t utility, it’s connection. We can find more and more things like that.
Audience (Livestream Chat): For Dr Taiuru: Is there any work underway to set up an AI system with appropriate Māori content? And what makes it best practice?
Dr Karaitiana Taiuru: Indirectly, yes. There’s a Māori data sovereignty movement happening now. Just last Thursday, a Māori data governance resource was published. You need Māori data governance before you can really concentrate on a Māori AI system. It’s fragmented and slow, but I believe there are enough Māori technologists now to make it happen in partnership with people like Michael and other experts.
Audience Member: Michael, you asked earlier: how can we integrate natural and artificial intelligence within existing and new kinds of organisational intelligence? Do you have any thoughts or examples?
Michael Witbrock: One thing I worry about in AI debates is the assumption that human systems are inherently good and kind. We apply tests to AI systems to see if they meet ethical standards, but we don’t apply those tests to ourselves or our institutions.
Universities, for example, have processes that are torturous. If those were run by AI, we’d ban them. But we accept them when they’re human-run. By integrating AI into our systems, we can build processes that pay attention, communicate, and achieve good outcomes.
Imagine if the Ministry of Social Development had an AI system responsible for checking whether every child in New Zealand was being looked after. That’s someone’s job now, but it’s hard. AI could help.
Audience Member: We’ve had a theme today around checks and balances, guardrails, regulation. Now you’ve introduced the idea of AI regulating other AI. Where do you see future decisions being made in this space?
Michael Witbrock: Maybe we don’t need to make as many decisions as we do now, or at the scale we make them. We tend to simplify situations because that’s all we can do. But we should enable people to make good local decisions that lead to good global outcomes.
Decision-making should be a joint activity between humans and AI systems. We should also include the environment as a full participant in decision-making. Our tools for making good decisions are about to become much better. We should take advantage of that.
Gabriela Mazorra de Cos: From the AI Forum, we do have initiatives around risk considerations. We offer masterclasses on AI governance and will be releasing a suite of toolkits by August for different types of organisations. These will be available on our website, and we’re happy to take on any concerns from today’s session.
AI Forum New Zealand panel
Gabriela Mazorra de Cos (Convenor, Executive Council Member, AI Forum New Zealand), Professor Michael Witbrock (University of Auckland), Dr Karaitiana Taiuru (Taiuru & Associates Ltd)
Video transcript
Jenny Poskitt (Convenor): Tēnā koutou. Thank you for that warm and lovely introduction. It is my pleasure and privilege to introduce the provider panel this afternoon, who are going to share their insights about AI.
We have Dr Mark Nichols, Executive Director of the Open Polytechnic of New Zealand. Mark has 25 years of experience in higher education and is internationally recognised as a leader in the field. He believes formal education should be learner-centred and holds values that will resonate with many of you: engagement, enlightenment, and empowerment. He is also Chair of the Commonwealth of Learning and a member of the ICDE Executive Committee.
I would also like to introduce Dr Kevin Shedlock from Te Herenga Waka – Victoria University of Wellington. Kevin is a lecturer in the School of Engineering and Computer Science, and his research focuses on working with indigenous communities to better understand technology through an indigenous lens.
Last but not least, we welcome Sue Townshend, Academic Director at Le Cordon Bleu New Zealand. Sue is skilled in e-learning, student engagement, gastronomy, event management, research, accounting, and statistics.
Let us begin with Dr Mark Nichols.
Dr Mark Nichols: Kia ora koutou. I will start with the obvious: assessment is in dire need of a shake-up. It should be no surprise that many of our pre-19th century approaches are now under threat. I said at a Massey University symposium earlier this year that technology always brings with it an amplifying effect, whereby benefits and risks dynamically increase depending on the breakthrough.
From my perspective, AI can and will bring terrific benefit to education. On the other hand, it also has the power to amplify two core problems that have always been headaches for the educational endeavour. These are, firstly, the human urge to take shortcuts, and secondly, our tendency to overestimate what we know. These, I suggest, are best summarised together as ignorance.
Ignorance provides the very rationale for education. Education exists to promote understanding as a cure for ignorance. It is that desire to bring about understanding that motivates every single career represented in this room. AI is just the latest series of technologies that can influence both ignorance and understanding. It can be used to mask or promote ignorance, but it can also be harnessed to enhance understanding.
We have heard today of some clear difficulties around both, but I do not think we need to think about this as a binary issue. We want assessment practices that reduce ignorance and promote understanding. I believe we can achieve both through some simple yet effective assessment strategies that also provide additional benefits to learning.
So how do we ensure AI such as ChatGPT is not used for deceptive purposes or in ways that mask ignorance? I have four ideas, ranging from the simple to the complex.
First, video-based demonstration, which is a straightforward technique already in use.
Second, randomised verbal responses, inserting video-based voice responses within online quizzes.
Third, interactive oral assessment, like a viva voce, deliberately geared as a test or exam replacement.
Fourth, intelligent tutors, whereby learners have a personalised virtual assistant that guides them in their learning and gets to know them intimately in terms of their conceptual understanding and learning strengths and weaknesses.
The first three involve video, which is now quite commonplace. Invigilation is arguably key for catching out AI, but I suspect we can and should do better than remote exam proctoring, which I am assured is still very beatable.
We have the opportunity now to challenge not just our practices but our entire approach to assessment. In doing so, we might also make education a lot more flexible. We have the opportunity to return to interpersonal means of assessment. Rather than jumping forward to high-tech solutions, we may be best to step aside into the interpersonal, using everyday technical solutions.
Let me pick up on interactive oral assessment, or IOA. With IOA, we not only ensure invigilation, but we also reinforce skills and promote articulate accounts of understanding. It lends itself to all understanding-based learning outcomes. An extended conversation, recorded and based on a robust script and rubric, can be shaped to suit a variety of NQF levels. At its most basic, it can confirm submitted written work.
Ultimately, the issue is not whether a written assessment has benefited from AI or whether writing has been outsourced. It is whether the learner understands what has been submitted in their name and can evidence the learning outcomes required of a course. Verbal assessment is a fantastic way of doing that.
Griffith University has already been using IOA at scale. We have piloted its use at Open Polytechnic and the business division of Te Pūkenga. It has merit, but it needs to be systematised and legitimised. We need to start questioning why we would use essays over an approach such as this, where we can guarantee understanding based on an immediate verbal response. It is scalable, flexible, and robust. I believe it is a really effective example of how we can respond to AI constructively and appreciatively, rather than naively and punitively.
Why are we even asking whether we should ban this? There are many other ways in which we can approach assessment and make the most of it. Thank you.
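To make the "robust script and rubric" idea concrete, here is a minimal, hypothetical sketch of how an interactive oral assessment might be represented as data so the same conversation template can be reused and marked consistently. This is not Open Polytechnic's, Te Pūkenga's, or Griffith University's actual system; every field name and example value is an assumption made for illustration only.

```python
# A minimal, hypothetical sketch of an interactive oral assessment (IOA)
# script and rubric. Field names are illustrative assumptions, not any
# institution's real schema.
from dataclasses import dataclass, field


@dataclass
class RubricCriterion:
    outcome: str            # the learning outcome being evidenced
    indicators: list[str]   # what an assessor listens for in the response
    max_marks: int


@dataclass
class IoaScript:
    course: str
    nqf_level: int
    questions: list[str]    # the scripted conversation prompts
    rubric: list[RubricCriterion] = field(default_factory=list)

    def total_marks(self) -> int:
        return sum(criterion.max_marks for criterion in self.rubric)


# Example: a short script that could confirm understanding of submitted written work.
script = IoaScript(
    course="Introduction to Research Methods",
    nqf_level=5,
    questions=[
        "Talk me through the main argument of the report you submitted.",
        "Which source influenced your conclusion most, and why?",
    ],
    rubric=[
        RubricCriterion("Explains the argument in their own words",
                        ["accurate summary", "no reliance on reading aloud"], 10),
        RubricCriterion("Evaluates sources critically",
                        ["names a source", "justifies its relevance"], 10),
    ],
)
print(script.total_marks())  # 20
```

Structuring the script and rubric in a reusable form like this is one plausible route to the systematisation and legitimisation Mark calls for: the same conversation template can be pitched at different levels by swapping questions and criteria.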
Jenny Poskitt: Thank you very much for that thought-provoking and brief address. We look forward to coming back to your ideas. I now invite Kevin to speak.
Dr Kevin Shedlock: Kia ora. The good side of me, my mum’s side, is from the East Coast. My name is Kevin Shedlock, and I am a lecturer in the Engineering and Computer Science Faculty at Victoria University. I am supposed to be submitting my thesis, but I feel like I just want to let it sit a little bit. I need to go back to it and ask myself, “Do I really want to submit that? Was it rubbish?” So I am in that mode at the moment – procrastinating.
As a computer scientist, my area is AI, virtual reality, and augmented reality. I work a lot with fuzzy logic, and I love fuzzy logic. It is a sexy name. Fuzzy means something that is not exactly 100 percent true or crisply correct. It speaks to a lot of Māori whakaaro. One day I am in, the next day I am out. It is like a revolving door.
I also have students working on convolutional neural networks, image recognition, and support vector machine algorithms for decision-making matrices. These are the types of things we are looking at.
Make no mistake, we have a battle on our hands with AI. Māori and indigenous peoples around the world have a battle with this technology. I think it is an appropriate time to confront it.
I always start with a whakataukī, and I want to comment on the head of the fish analogy mentioned earlier. The head of the fish does not go anywhere without the tail, which is at the top of the island. That tail decides the speed and power. We also need our whānau on the East and West coasts to determine direction.
I always say to the whānau groups I work with: I am the guy you do not trust, because I am working on humanoid digital humans. If we are going to get it wrong as humans, then I want the next group of humanoids to be Māori.
Indigenous knowledge is very important to us. It is tribal knowledge, ancient knowledge. We form our new indigenous theories from this ancient knowledge and build digital artefacts. These theories and artefacts create new emerging knowledge. Our language is evolving, our tikanga is evolving, our kawa is evolving. As Māori, we are evolving. These are the challenges we face in our kura and schools.
Now is a good time to take on the question of assessment. What does it look like for us as Māori? I have many opinions, but I will leave it there for now.
Jenny Poskitt: Thank you, Kevin. I now welcome Sue to speak.
Sue Townshend: Good afternoon. I feel like the little PTE voice sitting in the corner of the room. Thank you for asking me to represent the PTE sector on the panel.
I am from Le Cordon Bleu New Zealand, based in Wellington. We have been there since 2012. We were accredited through WelTec initially and became our own PTE in 2021. We are unfunded, and our student base is roughly 95 percent international. The borders have reopened, and we are growing again, which is fantastic. Now we have AI to contend with.
We offer programmes from Level 2 to Level 7, including cookery, advanced diplomas in patisserie and cuisine, and a culinary arts and business degree. The impacts for us seem to be the same as for universities and schools.
When I was asked to speak today, I thought about how the PTE sector is impacted. I identified three main areas.
First, our graduates. Industry is changing and evolving through AI. The jobs we are preparing students for may not exist in the same way. The PTE sector includes health service workers, tradespeople, customer service, and farming. These areas will change, and it will be hard for us to keep up.
Second, change. Our facilitation needs to change. Our pedagogical practices and assessment need to change. We need facilitators who are open to change. We are a small school, like many PTEs, and we do not have the resources or technology that larger institutions have. We need to find ways to share good practice and work together.
Third, accessibility and equity for our learners. Our learners may come from lower socio-economic backgrounds. We need to ensure all learners have access to the technology they need.
We are impacted just like everyone else in this room. Thank you.
Audience Member: Kia ora. I am not on the screen, but I was on the agenda, so I am going to shoot my shot and take it. I just have a few provocations for you. We talked briefly about the function of education just now – that we ensure tamariki have the key skills they need to function in the world. We do not know what the world looks like, but I think we can split that into three types of skill or acquisition they can walk away with: content knowledge, skills, and personal growth or attributes.
So my provocation is: which of those are we assessing? Which should we continue to assess, and which should we stop assessing? I am at a school where we assess curiosity, resourcefulness, and intentionality, among other attributes. When we mark an assessment task, we are grading content knowledge, skill knowledge, and personal growth.
I think we have been talking a lot about skill and content, and not very much about personal growth. When we are rethinking assessment, I just want that to have been said at some point.
Our earlier panel mentioned the idea of special assessments for purpose, and I really liked that. It made me think about the literacy and numeracy assessment that is part of the NCEA refresh. I think it ties into what Kath said earlier – we do not need to climb to base camp every time. If you have this assessment, you can get your NZQA qualification from us because we know you can read and write. So maybe the amount I have to assess can be limited by that.
Is that something we can do with AI – assess whether a student is capable of using AI in a trusted manner?
My last provocation is that we talk about changing and rethinking assessment, but we have already been rethinking assessment before this. A lot of the things we have discussed today have already been issues. We have been addressing them.
The NCEA change package currently seeks to simplify assessment by reducing the number of assessments and narrowing the focus. One of our speakers earlier talked about broadening that band, but our approach so far has been to narrow it. So we can say, “Instead of a range of skills, we are going to focus on one skill.” What is that one skill we are assessing, so we are not muddying the waters?
We need to consider whether we want to broaden or narrow. At the moment, we are narrowing. Is that the right choice?
A lot of the critique I have heard today about assessment – and I will leave it at this provocation – makes me wonder whether the critiques have actually been about task design, not about assessment itself. Do we need to prescribe tasks? I do not think so. But can we make suggestions about the tasks we set, especially in secondary school contexts? That is where my brain is at: what it is we are assessing, and how.
Kia ora.
Jenny Poskitt: Thank you. You have probably done my job for me, but I will leave those questions for you to explore in your table discussions shortly.
Now, we are going to make the most of having these experts on stage. I would like to ask Sue first: food is one of the most fundamental things humans need, along with oxygen, love, and care. What might the implications of AI be for the hospitality sector, and particularly for Le Cordon Bleu?
Sue Townshend: Hello. Yes. Having been out in industry and worked in it, I do not think we are going to have robotic chefs quite yet. There are some operations out there with wok-style robot chefs working in the background – you give them the food, they fry it, and it all goes into a bowl. But Le Cordon Bleu is not there yet, and neither are many other PTEs that offer cookery qualifications.
It does have implications, even for our Level 2 certificates, because those students have assessments to do. It is not just cookery: they have projects, menu development, and other tasks. We can see ways to use AI to help them with that learning. We just need to sort out how we are going to do the assessment.
Going forward, it is an industry that probably is not going to be replaced quite yet. But even so, we do have assessment projects and portfolios, and our Level 5 advanced classes have essays and reports to write as well. So they are not just cooking – they are doing assessments too.
Jenny Poskitt: Thank you, Sue. Cooking really is a combination of multiple skills, knowledge of nutrition, preparation, creativity, and building relationships, especially in cafes and restaurants. That leads me to ask Mark: in terms of AI and leadership, what does it mean for how we lead and what we lead for?
Dr Mark Nichols: Wow, that is incredibly broad. Do you mean in terms of educational leadership?
Jenny Poskitt: Yes.
Dr Mark Nichols: Right. I would like to come back to an earlier point about us becoming more valued as people rather than as functions. I think there is an opportunity for us to return to what it means to develop in the humanities, to become the best people we can be.
From a leadership perspective in AI, I think we need to be creative, not bury our heads in the sand, and look forward to how we can change systems and processes. We need to get back to a more interpersonal form of education and sharing understanding.
I am not sure if I have hit the question exactly – it is extremely broad.
Jenny Poskitt: Thank you. It was to invite you to look at the human quality, and you did.
Now I would like to turn to Kevin. In terms of engineering, something very practical and applied, what are the implications, and where is the intersection between human knowledge and judgement and what AI might do for us in engineering?
Dr Kevin Shedlock: I will address this from a Māori perspective, because that is the realm I am familiar with. My thesis is about building indigenous practices into the construction of technology or IT artefacts.
For Māori, I encourage us to revisit the whole framework we are standing on. What is the platform we reside on?
AI is like a competition – it is always about getting better, one over the other. Māori knowledge and Māori AI should be about relationships. Those relationships should be based on mutual agreement.
There should be some kind of recursive function or callback that goes back to the original data sets. In doing that, we can create a platform that Māori can drive from. That might not be useful right now, but it should be useful for us.
I was very interested in Michael’s comments earlier, how we let technology drive us as humans, or whether we as humans implement a socio-technical version of driving technology. That is the ground at the moment. That is the game.
For Māori, it is about having not just a socio-technical system, which is a very Western concept, but an indigenous socio-technical system, encapsulated with Māori ideas, values, and protocols. I hope we can have that discussion further down the track.
Jenny Poskitt: Now, one question generally to the panel, whoever would like to respond, or all of you. In the age of AI, what is the relevance of education provider institutions? We can access information anywhere, any time. What is the value added of your organisations and education institutions?
Dr Mark Nichols: I think you can have access to a lot of information and still remain totally ignorant. That brings me back to the point of education. We need understanding, at least the ability to be information literate, so we can engage with ideas and take them forward constructively and contribute to society.
Having access to tools like Google or AI is almost the same thing. It does not replace your own understanding or your ability to turn information into something useful for society or meaningful to who you are.
Audience Member: There is also a question about education generally, because homeschooling exists. Over COVID and with anti-vax movements, homeschooling has increased. So there is a question about the value of having experts teach our children.
I think that is relatively straightforward – it is because they are experts. They are experts in developmental growth, pedagogical practices, and their subject areas.
Where would you draw the line? At five years old, are you done with education? At twelve? At eighteen? At twenty-four?
We are never done with our education. We are lifelong learners. Being in an education system should encourage students to be lifelong learners. That is part of the purpose of education, to develop critical thinking and keep that wheel turning.
Dr Kevin Shedlock: What I was going to say is that I am excited. I am excited about what GPT brings to the table. Yes, it is disruptive. Yes, we have to confront it. The cat is out of the bag.
But it is an opportunity for Māori to be engaged with this technology, alongside the rest of the country. Māori are at the same level as everyone else with AI and GPT.
I am excited about how we are going to revisit curriculums and assessment approaches. We have heard a lot today about holistic types of assessment that would suit Māori.
I think the core thing is that it allows us, as Aotearoa New Zealand, to define ourselves, to characterise ourselves as uniquely New Zealand. How we go about this over the next few years will help shape who we are and what our identity looks like.
For Māori, we can be part of this AI journey. It is a big problem for the administrators, but that is their problem. I am cool with that.
Sue Townshend: Just quickly, I think it is a really good opportunity to shape and change education. I do not think it is about whether we still need it. I think we do.
We are active participants in what is happening, and we are not going to be able to stop it. We have said we cannot ban it.
But we need to bear in mind that we have AI, and we need to mix that with our own intelligence, our human intelligence, going forward. That will make for a better learning experience for our students.
Provider responses panel
Jenny Poskitt (Convenor, Associate Professor at Massey University), Dr Mark Nichols (Te Pūkenga), Kevin Shedlock (Te Herenga Waka - Victoria University of Wellington), Sue Townshend (Le Cordon Bleu, ITENZ)
Video transcript
Jenny Poskitt: Kia ora, what a day. Firstly, I just want to congratulate and thank NZQA immensely for bringing such an amazing group of speakers together. The multiplicity of perspectives, the challenges to our thinking, and the information shared have been invaluable. The fact that you have brought people together from a whole range of organisations and viewpoints is something I will return to at the end, because I think that is one way we can move forward collectively and collaboratively.
Let me first try to sum up all that richness. If you have a simple brain like mine, you come up with a few key headings.
First, we have been given a dose of reality. The reality is that we are in a period of rapid change. It is evolving exponentially. Think of what Simon said this morning about the jump from 2K to 2 million tokens of vectors mapping meaning. It is becoming increasingly sophisticated. It is challenging our assumptions about what to learn, how to learn, why we learn, and how we capture and evidence that learning. How do we credibly monitor progress? The pervasiveness of AI and the climates that are conducive to cheating are real concerns.
But it is so much richer than that. We then moved on to the opportunities. How do we shift from assuring learning to promoting learning? How might we optimise support and advice, and champion learning? How do we grow motivation, excitement, and that thrill of engagement – that lightbulb moment we see in our students? This is an opportunity to embrace, design, and rethink. We are being called to adapt and respond.
We have a fantastic opportunity to make a real difference in equity, fairness, access to resources, and inclusion. The right to the same, or even better and different, life opportunities. We are being challenged to have greater tolerance for ambiguity and the unknown, but through that, to look at the human being rather than the human doing.
Focusing on the human being demands that we look at respect, relevance, and reciprocity – the give and take that is so critical to relationships. It requires us all to take responsibility. But we must be mindful of the challenges and risks. This is not utopia. The biggest challenges are probably ethical. The greatest opportunities, but also challenges, lie in how we celebrate and value our cultural worldviews and taonga.
To move forward, we need to value and optimise interpersonal situations. We need dialogue, to truly listen and learn with others, and to do things differently. To start thinking, applying, and creating. We must value those human qualities that no AI system can truly replicate. It might mimic, but it cannot embody the human qualities of adapting, respecting disabilities, ageing, and cultural worldviews.
It is about connecting – to the land, the environment, people, whānau, spirituality, and the broader context. The challenge is to focus more on the human being. We have tended to value the doing, the utilitarianism. We need to adapt our assessments considerably.
I think it is an exciting time to be in assessment, because it defines what we do, our principles, and our adaptability. I will pause there and hand over to Grant. Thank you.
Dr Grant Klinkum: Thank you very much, everyone, for joining us here today. In particular, I would like to acknowledge the speakers. I suspect that some of them may later wish they had not appeared today, because I sense that many organisations will be reaching out to them to share the knowledge and insights we have all heard.
We have had the gift of enormously diverse and important perspectives. This was the hope of NZQA and partner agencies in organising this symposium. It seemed to us that a whole-of-education approach is required to capitalise on the opportunities and mitigate some of the risks inherent in generative AI. This symposium was designed to be a small contribution to that collective effort.
Noah and other students reminded us today that school-level adaptation and openness will be required. In fact, we heard a strong challenge. It sounded to me like teachers were being urged to embark on a learning journey themselves in order to remain relevant in front of learners.
Claire spoke about the ways in which AI can be integrated, interrogated, utilised, and managed alongside students, not by teachers for students. Although this point drew on a particular school context, it seems to me that it applies more widely.
So how can policymakers, regulatory bodies, schools, tertiary education providers, and learners together co-design good practice, guide rails, supports, and protections?
I thought the encouragement by one speaker to think broadly, not narrowly, about how to adapt teaching, learning, and assessment was important. We need to ensure that sufficient institutional resources are dedicated to confirming that learning has occurred, rather than just detecting cheating.
Jason reminded us, very helpfully, that being honest is not always easy. We need to design assessment systems that maximise the chances of students being honest. That framing, building ethical values and behaviours, not merely avoiding academic integrity problems, was a really helpful addition to the discussion.
We heard innumerable practical examples of mitigating and managing risk through assessment design and institutional policies that are agile and adaptive. At the same time, more fundamental questions were raised about the nature and place of assessment in an age of AI.
We saw from some Australian colleagues frameworks to help us think through policy, operations, and resources. I particularly liked the spectrum from champion through to criminal, with the associated stances from support through to compulsion, and the various tactics required at each level. That was really interesting.
It was noticeable today that speaker after speaker resisted the temptation to predict how generative AI will develop. I think that lack of willingness to offer predictions underscores that there is a genuine paradigm shift afoot, rather than just a mundane question of how to adapt to one more tool.
This leads me to observe that for all of us, educators, institutions, policymakers, there is a very important question to grapple with. How do we avoid over-dramatising or exaggerating the opportunities and risks, while also ensuring we do not underestimate the implications, particularly of the most recent versions of AI that Simon addressed in his introductory comments?
A common theme today was that training people to learn specific skills may become less relevant, and helping people to be the best humans they can be, and able to care for others, will become even more important.
We have heard some mind-stretching and values-enriching challenges, including the question of what the good life looks like. I had never contemplated that part of the good life might include sending possums back to Australia.
We were also reminded, and Jenny touched on this, that generative AI has the potential to exacerbate inequalities. We need to pause and think carefully about this likely reality and how we can protect against it.
On a lighter note, the possibility that in the future we will have more time to spend on deeper thinking, rather than simply collating and sifting information, is both liberating and frightening. The phrase “no excuses for sloppy thinking or poor assessment responses” definitely comes to mind.
I would like to thank Neil, Sue, Gavin, and others from NZQA, and representatives from the university sector and government agencies who were part of the organising committee. A very special thank you to Lee for looking after us so well.
Finally, to all participants in the symposium, thank you for your interest in both the specific issues of assessment design and the bigger questions about the purpose of education and the role that assessment plays in an age of AI.
We will, of course, make today’s materials available to everyone. Please expect invitations for ongoing dialogue and specific project work as agencies build our understanding and supports in relation to assessment in the age of AI.
Travel home safely, everyone. I will hand you back to Lee.
Reflections on the day and next steps
Jenny Poskitt (Associate Professor at Massey University), Ellen MacGregor-Reid (Deputy Secretary, Ministry of Education), Dr Grant Klinkum (Chief Executive, NZQA)
We used AI to help us create the transcripts on this page. While AI tools were used, we take full responsibility for the content, its accuracy, and its presentation.